Saturday, April 13, 2024

social media conventions and aliens in the ether


Starting locally, I've noticed how there are many different conventions about how people use different social media platforms (social networks, email, microblogging, etc.).


At one extreme, some people DM me on Slack - this is annoying as, to save my sanity, I have turned off notifications on everything, and I look at different platforms with different frequencies - Slack, mostly, once a day, compared to say, WhatsApp (and Signal and Matrix), once an hour at least. While I don't use Teams for messaging, I know people who do, but they are signed on during all working hours, so that works OK for them.

At another extreme, some people only use a platform in broadcast mode, so an email list is flooded with "how do I leave this list" messages, or a whatsapp group is flooded with "please stop sending your messages to everyone" messages.


Which leads me to the global problem - without interoperability, we have to select a channel we use for a mode of use, and there are going to be lacunae, or indeed, black holes, and inter-galactic wastelands with no information at all.


Which leads me to the universal problem, which may be one explanation for Fermi's Paradox - we are hearing from aliens in the ether all the time, but most of them are using broadcast (as are we), and what happens to the shared spectrum when everyone broadcasts all the time? You get a descent into pure noise. Indeed, we can work out that lots of aliens are NOT using broadcast, otherwise we'd be subject to Olbers' paradox, which is to say, the sky would be (modulo quantum limits) white noise from all the interfering broadcasts.

A slightly more advanced alien civilisation might think "aha, broadcast - shared spectrum, we need to employ collision detection, or even better, collision avoidance" just like Ethernet and WiFi do already on our planet. However, a little more thought would suggest that the protocol for this might suffer from rather high latency when waiting for a "Clear to Send" response to a "Request to Send" message over the light years.  So obviously smart aliens would do one of three things:

  • frequency division multiplexing - each civilisation gets a specific RF band to use
  • space or code division multiplexing - we develop really good collimators or inter-stellar chipping sequences
  • cooperative relaying and power management - we place "cell towers" at convenient places (e.g. white holes and black holes) and then avoid interference by switching out of this universe (like cellular switching onto glass fiber networks, but in this case, interstellar wormholes).

The other thing these really smart aliens would do would be to prevent our wildly stupid RF reaching them at all by clever filtering. A "really clever filter" is a very big Faraday cage, which could be built out of suitably designed dark matter. This is also why we don't see the white noise - we are in our own RF bubble. We are alone. All the clever people are the other side of the barrier.
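To put a rough number on the latency worry about "Request to Send" / "Clear to Send" over the light years, here is a minimal sketch. It assumes signals travel at light speed with no processing delay at either end; the function names and the one-day message length are made up for illustration, and the distances are approximate.

```python
# Sketch only: idealised RTS/CTS handshake latency between two stars.
# Assumes light-speed signalling and zero processing delay at both ends.

def handshake_delay_years(distance_ly: float) -> float:
    """Time from sending the RTS until the CTS reply arrives, in years."""
    # One light year of distance costs one year in each direction.
    return 2.0 * distance_ly


def channel_efficiency(distance_ly: float, message_years: float) -> float:
    """Fraction of time spent sending data rather than waiting for the CTS."""
    return message_years / (message_years + handshake_delay_years(distance_ly))


# Illustrative, approximate distances in light years.
NEIGHBOURS = {
    "Proxima Centauri": 4.24,
    "Sirius": 8.6,
    "Galactic centre": 26000.0,
}

for name, distance in NEIGHBOURS.items():
    wait = handshake_delay_years(distance)
    # Hypothetical message that takes one day to transmit.
    efficiency = channel_efficiency(distance, message_years=1 / 365.25)
    print(f"{name}: wait {wait:.2f} years, efficiency {efficiency:.6%}")
```

Even for our nearest neighbour the handshake costs years per message, which is why the three alternatives above look more attractive than collision avoidance.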

Sometimes they do visit us, but to avoid detection, they largely use obsolete social media platforms like MySpace and Orkut, where they can have a laugh.

Sunday, March 10, 2024

Witch Consumer Magazine, review of the leader boared top three LLMs "Conformité Ecologique" (the ubiquitous CE marque)

We analyzed the CE claims of the following three large languish models with respect to four key metrics for the Ecologique, as agreed in European law, namely enthalpy, internet pollution (measured in LoCs -- Libraries of Congress), bio-dediversification, and general contribution towards the heat death of the universe.

Currently, according to the boared, these are the top-of-the-heap in terms of hype-parameters:

The Faux Corporation's Pinocchio

Astravista's Libration

Sitting Duck's Nine Billion Dogma


We hired some prompt engineers to devise a suitably timely benchmark suite, and embedded the three systems in our whim tunnel, taking care to emulate all aspects of the open road to avoid any repeat of the folk's wagon farage.

Indeed, we used all three systems to design the whim tunnel, and compared the designs to within an inch of their lives until we were satisfied that this was a suitably level playing field on which to evaluate.


The benchmark suite will be made available later, but for now, suffice it to say that we were able to exceed the central limit theorem requirements, so that our confidence is running high that the results are both meaningful, and potentially explainable, but certainly not excusable.


Enthalpy

Pinocchio

Pinocchio ran very hot, both during training and during every day use.

Libration

Libration was about half the temperature of Pinocchio.

Dogma 

Roughly 12.332 times less than the next worst.


Pollution

Pinocchio

The Internet was worse off after this tool was used by approximately 3 LoCs.

Libration

Again, about half as bad.

Dogma

Was difficult to measure as the system never stabilised, but oscillated between getting worse and then better; however, the improvements were usually half the degradations.


De-diversification

Dogma

This was a shock - we expected better, but in fact the outcome was really rapid removal of variance.

Libration

Around half as bad as Dogma.

Pinocchio

Very slightly less bad than Libration.


Entropy

Libration

Excess use of Libration could bring the heat death of the universe closer about 11 times faster than a herd of small children failing to tidy up their rooms.

Pinocchio

Absurdly, only 3x better than Libration.

Dogma

Appeared to gain from the Poppins effect, and generally ended up tidier than before.


Some critics have pointed out that Enthalpy and Entropy are two sides of the same coin, and that pollution is likely simply the inverse of de-diversification. Nevertheless, we proceeded to evaluate all four in case we might later find differently.

In general, none of these products meets the threshold for a CE mark, and for your health, and sanity, we strongly recommend that you do not use any of them, especially if you are in the business of prediction. Next week, we will review a slew of probabilistic programming models with a special emphasis on the cleanliness of the Metropolitan Hastings line.


Monday, February 26, 2024

Towards International Governance of AI

I wonder what people are really thinking when they think of governance of Intelligence?

If we were considering human intelligence (which we are by extension) we had better tread carefully, especially when considering who owns it. The ability to reason creatively, to innovate, is not really the same as any other thing we have sought governance over:


  • nuclear weapons (test ban treaty, and Pugwash conferences)
  • spectrum allocation
  • orbits around earth
  • maritime & air traffic - fuels, tracking, control etc.
  • recombinant DNA (Asilomar conference)
  • the weather (and interventions like geo-engineering, e.g. see RS report on same)

What's similar about these, and what is different?

Well, we only have one go at each - there's a very countable human race, planet, sea, zombie apocalypse, climate emergency. We don't have time to muck about with variants of rules that apply to fungible material goods. We need something a tad more radical.

So how about this: a lot of AI is trained on public data (oxygen == the common crawl) - this is analogous to robber barons who enclosed the commons, then rented out the land to farmers to graze their cattle on, which used to be a free shared good...

A fix for this, and to re-align incentives, is to introduce a Piketty-style tax on the capital value of the AI. We could also just "re-nationalise" it, but typically, most people don't believe state actors are good at managing things and prefer to have faith in the invisible hand. However, history shows that the invisible hand goes hand-in-glove with rich-get-richer, so with a tax on capital (and as he showed in great detail in Capital in the 21st Century, it does not have to be a very high rate of tax to work), we can return the shared value of the AI to the common good.

A naive way to compute this tax might be to look at the data lakes the AI was trained on, although this may not all be available (since a lot of big AI companies add some secret sauce as well as free or appropriated ingredients) - so we can do much better by computing the entropy of the output of the AI.

A decent algorithm should produce very information-rich output compared to its size - e.g. a modern LLM with 100s of billions of parameters should produce short sentences or images which are highly instructive - we can measure that, and tax the AI accordingly.
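How exactly to measure the information content of the output is left open above. As one crude illustration (and certainly not a proposed tax formula), here is a sketch that computes the empirical Shannon entropy of a piece of text in bits per word. The function names and the whitespace tokenisation are my assumptions for the example.

```python
import math
from collections import Counter


def shannon_entropy_bits(tokens):
    """Empirical Shannon entropy of the token distribution, in bits per token."""
    counts = Counter(tokens)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def entropy_per_word(text):
    """Crude proxy: lower-case the text, split on whitespace, measure entropy."""
    return shannon_entropy_bits(text.lower().split())


# Hypothetical sample outputs, purely for illustration.
repetitive = "the cat the cat the cat the cat"
varied = "tax the capital value of the trained model"

print(entropy_per_word(repetitive))
print(entropy_per_word(varied))
```

A caveat on the design: raw entropy rewards randomness as much as insight (pure noise scores highest), so any real measure of "instructive" output would need something more than this, for example comparing against what a reader could already predict.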

This should also mitigate the tendency to seek data without agreement or consent. 

I realise this may sound like a tax on recording media (back in the day, there were campaigns about "home taping is killing music"), but I claim there's a difference here in terms of the over-claimed, over-hyped "value add" that the AI companies assert - the real value was in the oxygen, public data, like birdsong or folk tunes, which should stay free or we die. If we cannot make it free, I suggest we do the next best thing and tax the rich. Call me old fashioned, but I think a capital value Piketty tax to mitigate rentiers is actually a new idea, and might actually work. We could call it VAIT.

Sunday, February 18, 2024

Government Procurement of Open Systems Interoperability or Open Source - a lesson for Digital Public Infrastructure

40+ years ago the US and European countries devised a government procurement policy which was to require suppliers to conform to Open Systems Interconnection standards. This was a collection of documents that could be used in an RFP (request for proposals) to ensure that vendors bidding for government contracts to supply communications equipment, software, systems and even infrastructure would comply with standards, so that the government could avoid certain pitfalls like lock-in and vendor monopolies arising in the communications sector.

It worked - we got the Internet - probably the world's first digital public infrastructure, provided by both public and private service providers, equipment and software vendors, and a great deal of open source software (and some hardware).

There's one review of how this evolved back in 1990 that represents an interesting transition point, from what were International Standards for Interconnection provided by the UN-related organisation ISO or the ITU, to the Internet Standards, which were just about to come to dominate real world deployments. 1992 was a watershed point, when the US research funding agencies stopped funding IP infrastructure, and commercial ISPs very rapidly crystallised out of regional and national (and later, international) community-run networks (where communities had been collaborations of research labs and universities funded by DARPA and NSF, or similar in Europe).

Why did the Internet Standards replace the ISO/ITU standards as the favourites in government procurement? It is hard to prove this, but my take is that they were significantly different in one simple regard - the specifications were matched with open source implementations. From around the early 1980s, one example was Berkeley Unix, which included a rock solid TCP/IP software stack, funded by DARPA (derived from one at BBN) and required to be open source so others (universities, commerce and defense) could use and add to it as needed in the research programs of the 1980s, as actually happened. By 1992, just as the network went beyond government subsidy status, Berners-Lee released the first open source web server and browser (and specifications) and example sites boomed. Then we had a full-fledged ecosystem with operational experience, compelling applications, and a business case for companies to join in to extend and make money, and for governments to take advantage of rapidly improving technology, falling prices, and a wide choice of providers.

So in a competing world, standards organisations are just one more sector, and customers, including some of the biggest consumers, i.e. governments, can call the shots on who might win.

Now we face calls for Digital Public Infrastructures for other systems (open banking, digital identity being a cornerstone of that, but many others) and the question arises about how the governance should work for these.

So my conclusion from history is that we need open standards, we need government procurement to require interoperability (cf. the European Digital Markets Act requirement) and we need open source exemplars for all components to keep all the parties honest.

I personally would like to go further - I think AI today exploits the open availability of huge swathes of data to create new knowledge and artefacts. This too should be open source, open access, and required to interoperate - LLMs for example could scale much better if they used common structures and intermediate model formats that admitted of federation of models (and could even do so with privacy of training data if needed)...

We don't want to end up with the multiple silos that we currently have in social media and messaging platforms, or indeed, the ridiculous isolation between video conferencing apps that all work in browsers using WebRTC but don't work with each other. This can all be avoided by a little bit of tweaking of government procurement, and some nudging using the blunt instrument of Very Large Contracts :-)

Saturday, February 17, 2024

mandatory foley sounds

you know it was suggested that EVs that are so beautifully silent, should be required to make a bit of fake engine or tyre noise just so pedestrians and cyclists are aware they are there.

but what is far more urgent is that we need people carrying phones they are staring at to do the same (oh, ok, maybe not revving diesel, or screeching rubber - maybe some other thing like belches, or farts or other human like sounds)....then if i'm cycling along, i know there's a stupid pedestrian who doesn't know I am there because they aren't looking before they step into the road. 

the phone could also emit a radio beacon to warn EVs to slam the brakes on.

or we could just let darwin play out...


oh, thinking about this, we could also imagine that the reason aliens have not been in touch with earthlings in all the 100 years we've been beaming out radio to them is that it is entirely possible that any sufficiently advanced civilisation has forgotten where the unmute button is.

Monday, February 12, 2024

explainable versus interpretable

This is my explanation of what I think XAI and Interpretable AI were and are - yours may differ :-)


XAI was an entire DARPA funded program to take stuff (before the current gibberish hit the fan) like convolutional neural nets, and devise ways to trace just exactly how they worked.

Explainable AI has been somewhat eclipsed by interpretable AI, for the touchy-feely reason that the explanations that came out (e.g. using integrated gradients) were not accessible to lay people, even though they had made big inroads into shedding light inside the old classic "black box" AI. So a lot of stuff we might use in (e.g.) medical imaging is actually amenable to giving not just an output (classification/prediction) but also which features in the input (e.g. X-ray, MRI scan etc.) were the ones that mattered, and indeed, which labelled inputs were the specific instances of priors that led to the weights that led to the output.
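For anyone who hasn't met integrated gradients, here is a toy sketch of the idea in plain Python. The "model" is a made-up three-input logistic regression (weights and inputs are hypothetical, nothing like a real imaging network), and the path integral is approximated with a simple midpoint Riemann sum from an all-zeros baseline.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical toy model: logistic regression with made-up weights.
WEIGHTS = [2.0, -1.0, 0.0]
BIAS = -0.5


def model(x):
    return sigmoid(sum(w * xi for w, xi in zip(WEIGHTS, x)) + BIAS)


def gradient(x):
    """Analytic gradient of the model output with respect to each input."""
    p = model(x)
    return [p * (1.0 - p) * w for w in WEIGHTS]


def integrated_gradients(x, baseline, steps=200):
    """Midpoint Riemann-sum approximation of integrated gradients."""
    totals = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        # Point on the straight line from the baseline to the input.
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        for i, g in enumerate(gradient(point)):
            totals[i] += g / steps
    # Scale the averaged gradient by how far each input moved.
    return [(xi - b) * t for xi, b, t in zip(x, baseline, totals)]


x = [1.0, 0.5, 3.0]
baseline = [0.0, 0.0, 0.0]
attributions = integrated_gradients(x, baseline)
print(attributions)
# Completeness check: attributions should sum to model(x) - model(baseline).
print(sum(attributions), model(x) - model(baseline))
```

The attributions say how much each input feature pushed the output up or down, and they add up to the change in output, which is what makes this a mechanistic account rather than a story told afterwards. It is also a list of numbers, which is roughly why lay people found it inaccessible.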

Interpretable AI is much more about counterfactuals and showing from 20,000 feet how the AI can't have made a wrong decision about you because you're black, since the same input with a white person gives the same decision... i.e. it is narrative and post hoc, as opposed to mechanistic and built in...

It is this latter that is, of course, (predictably) gameable - the former techniques aren't, since they actually tell you how the thing works, and are attractive for other reasons (they allow for reasoned sparsification of the AI's neural net to increase efficiency without loss of precision, and allow for improved uncertainty quantification, amongst other things an engineer might value)...
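To illustrate what "gameable" might look like, here is a deliberately contrived sketch. Everything in it is hypothetical: the applicant fields, the thresholds, and the decision rule, which never reads the protected attribute but leans on a proxy (a postcode) that is assumed to correlate with it. This is just one way a counterfactual check could be passed without being fair, not a claim about any real system.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Applicant:
    income: float
    postcode: str   # hypothetical proxy, assumed to correlate with ethnicity
    ethnicity: str  # the protected attribute


def decide(a: Applicant) -> bool:
    """Hypothetical decision rule that never reads `ethnicity` directly."""
    return a.income >= 30000 and a.postcode != "X1"


def passes_counterfactual_test(a: Applicant, other: str) -> bool:
    """Does changing only the protected attribute leave the decision unchanged?"""
    return decide(a) == decide(replace(a, ethnicity=other))


alice = Applicant(income=45000, postcode="X1", ethnicity="black")
print(passes_counterfactual_test(alice, "white"))  # the narrative check passes
print(decide(alice))                               # yet the proxy drove a refusal
```

The counterfactual story comes out clean because the swap leaves the proxy untouched; a mechanistic explanation of the same rule would show the postcode doing the work.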

None of the post-DARPA XAI approaches (at least none that I know of) would scale to any kind of LLM (not even Mistral 7B, which is fairly modest in scale compared to GPT-4 and Gemini), so the chances of getting an actual explanation are close to zero. Given they would struggle for similar reasons to deal with uncertainty quantification, the chances of them giving a reliable interpretation (i.e. narrative counterfactual reasoning) are not great. There are lots of superficial interpreters based around pre- and post-filters and random exploration of the state space via "prompt engineering" - I suspect these are as useful as the old Oracle at Delphi ("if you cross that river, a great battle will be won"), but I would enjoy being proven wrong!

For a very good worked example of explained AI, the DeepMind Moorfields retina scan NN work is exemplary - there are lots of others out there, including use of the explanatory value to improve efficiency.

Sunday, February 04, 2024

standards and interoperable AI and the lesson from the early internet...

Back in the day (e.g. 1980), when we were deploying IP networks, there were a ton of other comms stacks around, from companies (DEC, IBM, Xerox etc.) and from international standards orgs like ITU (was CCITT - X.25 nets) and ISO (e.g. TP4/CLNP). They all went away because we wanted something that a) was open, free, including code and documentation...

and

b) worked on any system no matter who you bought it from, whether very small (nowadays, think Raspberry Pi etc.) or very large (8000 cores, terabytes of RAM, loads of 100s of Gbps NICs etc.), and

c) co-existed in a highly federated, global scale system of systems.

So how come AI platforms can't be the same? We have some decent open source, but I don't see much in the way of interoperability right now, yet for a lot of global problems we would like to federate, at coarse grain/large scale - e.g. for healthcare or environmental models or for energy/transportation - so we get the benefit, e.g. better precision/recall, longer prediction horizons, more explainability, and, indeed, more sustainable AI, at the very least, since we won't all be running our own silos doing the same training again and again.

We should have an IETF for AI and an Interop trade show for AI, and we should shun products and services that don't play - we could imagine an equivalent of what happened to European and US GOSIP (Government Open Systems Interconnection Profile) - which evolved into "just buy Internet, you know it makes sense, and it should be the law".
