The Future of APIs: APIs aren't the endgame, and they won't stay forever

  • I got about halfway down, and it suddenly started sounding like the Semantic Web reincarnated as an API service. This idea crops up about once every 10 years (in some form or another), and it runs into fairly predictable problems. There are two good essays I recommend people read before getting too excited about how well machines can interoperate without humans in the loop.

    Shirky's "The Semantic Web, Syllogism, and Worldview" http://www.shirky.com/writings/herecomeseverybody/semantic_s...

    Doctorow's "Metacrap: Putting the torch to seven straw-men of the meta-utopia" http://www.well.com/~doctorow/metacrap.htm

    Doctorow talks about problems with metadata, but these problems might apply equally to APIs and the API vocabulary discussed in the article. Specifically:

    2.1 People lie
    2.2 People are lazy
    2.3 People are stupid
    2.4 Mission: Impossible -- know thyself
    2.5 Schemas aren't neutral
    2.6 Metrics influence results
    2.7 There's more than one way to describe something

    The fundamental problems are that (1) getting people to agree on things is a surprisingly difficult and political problem that can never be solved once and for all, and (2) people have incentives to lie. If you invent a generalized way to look up any weather forecasting API, somebody is going to realize that they can make money gaming the system somehow. PayPal is really in the business of fraud detection, and Google is in the business of fighting blackhat SEO (and click fraud).

    So take your automated API discovery utopia, and explain to me what happens when blackhats try to game the system and pollute your vocabulary for profit. Tell me what will happen when 6 vendors implement an API vocabulary, but none of them quite agree on the corner cases. This is the hard part.

  • Here's the part I find most humorous. Machine-to-machine communication has already been attempted, and it always fell back to requiring human intervention. Web Services Description Language (WSDL) was an attempt to do exactly this, and it failed.

    WSDL didn't fail because it was frequently used to describe SOAP connectivity (and everyone knows that SOAP is obviously bad for all things); it failed because it was still people writing the APIs and the descriptions of those APIs. And since people aren't perfect, humans had to intervene to find and fix the bugs before anything could properly communicate with a WSDL-defined API.

    Until AI gets good enough, or we adopt a specific definition that meets all use cases (should be fun to watch), such attempts are going to keep failing. Because it's humans hiding inside the Turk's box, and will be for the foreseeable future, and computers are still pretty terrible at communicating with humans.

  • There's some very good info here towards the end, but the first half of the blog post made me wonder whether they were ever going to get to it.

    Perhaps that's just a consequence of it being a marketing post designed to appeal to several audiences at once while explaining the problem to decision makers who are unfamiliar with what they're trying to solve. I sympathize, but as a designer acutely familiar with the problems around API discovery, I found the first half an extremely cringey read.

    Anyway, you quoted all the right sources (save for Tim Berners-Lee's Semantic Web and Giant Global Graph), and I wish you much luck, but I think you're aware that this was tried before [1][2], where much less human interaction and intervention was required, and it nonetheless faltered. "Complexity" was a scapegoat at the time, and I think that's an unsatisfactory, almost too convenient an answer, so how do you avoid that same fate?

    [1] https://en.wikipedia.org/wiki/Web_Services_Discovery#Univers... [2] https://en.wikipedia.org/wiki/Web_Services_Description_Langu...

  • As someone who worked on a whole lot of integration projects in the recent past, I think network APIs can be "solved", but not by the means this article describes.

    JSON-LD looks like a reimplementation of something that was already done by XML, XML Schemas and WSDL. If several technologies that were designed to be semantic failed at automating network API integration, why would you think a sub-format for JSON will succeed? What does it do differently?
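
    For what it's worth, here's roughly what JSON-LD seems to add on top of plain JSON: an @context that maps each provider's ad-hoc keys onto shared identifiers, so differently shaped payloads expand to the same terms. A minimal sketch in Python (the vocabulary URL and field names are made up for illustration; expansion uses the PyLD library):

      # Two hypothetical weather providers use different key names, but each
      # @context maps its keys onto the same (made-up) shared vocabulary.
      from pyld import jsonld  # pip install PyLD

      provider_a = {
          "@context": {"temp": "https://example.org/vocab#temperature"},
          "temp": 21.5,
      }
      provider_b = {
          "@context": {"degreesC": "https://example.org/vocab#temperature"},
          "degreesC": 21.5,
      }

      # After expansion both documents carry the same property IRI, so a client
      # keyed to the vocabulary (rather than the raw JSON keys) can read either.
      assert jsonld.expand(provider_a) == jsonld.expand(provider_b)

    To me that looks like the same key-mapping exercise XML Schemas already offered, just in JSON syntax.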

    To really solve the problem of service integration we need to rethink our approach to "services" altogether.

    One solution that I see would involve a global registry of semantic symbols (e.g. "temperature", "location", "time") and a constraint solver. So yeah, distributed Prolog on the global scale. Systems would exchange constraints until they reach a mutually agreeable solution or fail. Then the derivation tree would be used to generate a suitable protocol. While I think this is possible, I don't think there is any real interest in stuff like this right now.
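
    To make the idea concrete, a toy sketch of that negotiation in Python (the symbol names and constraints are invented for illustration): each side publishes constraints over registered symbols, and the intersection, if non-empty, becomes the agreed protocol.

      # Toy negotiation over (invented) registered symbols: each party states
      # which representations it can offer or accept, and they settle on any
      # mutually agreeable choice, or fail.
      server_offers = {"temperature": {"celsius", "kelvin"}, "time": {"iso8601"}}
      client_accepts = {"temperature": {"celsius", "fahrenheit"}, "time": {"iso8601", "unix"}}

      def negotiate(offers, accepts):
          agreed = {}
          for symbol, options in offers.items():
              common = options & accepts.get(symbol, set())
              if not common:
                  return None  # no mutually agreeable solution
              agreed[symbol] = sorted(common)[0]
          return agreed

      print(negotiate(server_offers, client_accepts))
      # {'temperature': 'celsius', 'time': 'iso8601'}

    The real thing would of course need full constraint propagation and a derivation tree rather than a flat intersection, but the shape of the exchange is the same.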

  • There's an economic problem with the proposed direction of development.

    API providers are typically businesses or other actors whose interest is to lock API clients into their service. What would they gain by making their API interoperable with their competitors'?

    That is only viable for newcomers to that type of service, and it only works if they clone the API of an established player rather than improving on it and standardizing it.

  • This would all be fine if the goal were to have computers talk to other computers. In real life, it's typically organisations that start talking to other organisations, and _maybe_ there will be computers involved eventually. Most of the integration complexity is building a technical and functional clutch so two organisations can talk without leaking too much (dynamic) complexity across their boundaries. And that does not lend itself well to automation.

  • Hold up an object in front of three people. Ask all three to describe the object and what it can do. You'll get three different answers.

    The fundamental difficulty with APIs is that they force clients and servers to use the same domain model. Humans are necessary because only humans have the intelligence to reconcile differences in domain models.

    The idea of inventing some kind of discovery language is just deferring the difficult work of reconciliation. Computers will need to be as good at induction as the human brain before it will be possible to eliminate humans from the process.

  • I am extremely skeptical that autonomous APIs are possible without 90% of full natural language processing. Whatever we do to make APIs have their purposes be self-documenting, there will still be inferential gaps between what we say explicitly and what we mean.

    We could adopt a highly rigid language describing what it is that an API provides and what purposes that data is useful for, but that's restrictive and very brittle, especially against Silicon Valley's favorite activity of disrupting established ways of doing things. Like going from proofs in ZFC to proofs in first-order logic, we can make it more stable but only at the cost of losing lots of power and expressiveness.

  • I'm not sure this goal is very practical, even in the toy example you used (being able to swap data sources for weather forecasts).

    If you can use a common vocabulary to access multiple APIs, that requires that all the APIs implement the same feature set. Which means getting the API sources to agree on which features to implement and how to describe them, and stopping them from adding any features that the others don't have. But of course, they'll all be motivated to add their own features, to distinguish themselves from their competition.

    And once an API consumer is using a feature that other API producers don't support, the consumer is locked into that producer, and the whole shared vocabulary is for naught. And of course the API consumers will be looking for additional features, because those translate into features they can offer to their customers.

    Basically, this requires API producers to work together to hobble their ability to meet their customers' needs, all to make it easier for their customers to drop them for a competing endpoint. So it looks like a net negative for everybody.

  • APIs usually create a master/slave relationship. The strong party gets to define the API, and the weak party has to adapt to it. There are few fully symmetrical APIs.

    Usually the seller defines the API, but where the buyer is more powerful, the buyer sometimes does. See, for example, General Motors' purchasing system for suppliers. WalMart has something similar. There, the seller must adapt to the buyer's system.

    There are a few systems where there are interchange standards good enough to allow new parties to communicate as peers without a new implementation. ARINC does this for the aviation industry.

    We have yet to develop systems where both sides enter into communication and figure out how to talk. This is needed. XML schemas were supposed to help with that, but nobody used them that way.

  • - CORBA has service discovery and interface definitions.

    - SOAP has service discovery and interface definitions.

    - SOA has service discovery and interface definitions.

    Some of these are like over 20 years old. They also included many other features. I would not describe this as being "the future".

  • Can we avoid replicating the tarpit that is WSDL for RESTful services? Time will tell, but I have my doubts.

  • I think M2M is great and all, but the thing is, the machines are doing something for the humans. We can tell a machine to go talk to another machine but neither machine will know WHAT they are supposed to do unless we tell them. The HOW is certainly something that can be solved with time and uniformity, but the WHAT is always something that will require a human presence. And ultimately, with all the humans in the world needing different WHATs, I don't see the required uniformity ever coming to be on any scale larger than the local central authority.

  • The only prediction I have regarding autonomous APIs is: neuronstreams. If an API exposes a set of input channels and output channels where data can be sent between neural nets, they will be able to adapt to whatever format they see fit, without having to make sense to us humans.

    It is a scary thought but if we want to remove humans in this process we shouldn't even be able to understand the communication.

  • (first post)

    Howdy,

    This thread is making me consider dusting off a compiler that I wrote for a language that I created for designing APIs. That’s because I strongly agree that lack of versioning in many client/server architectures makes it difficult for devs to evolve their codebases. So, in this language I designed, the versioning of changes is a core concept.

    When a server offers an API which can potentially deal with different types of clients, or with clients that need stability, then versioning is a must if you want any chance of a sane codebase. Versioning allows the natural evolution of the API while maintaining compatibility with existing clients.
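
    To give a rough idea of what I mean (this is a generic illustration in Python, not my language): the server keeps a handler per declared API version, so old clients keep working while the message shape evolves.

      # Generic illustration of version-keyed handling, not the actual language:
      # old clients keep working while the response shape evolves.
      HANDLERS = {
          1: lambda req: {"temp": req["celsius"]},                  # original shape
          2: lambda req: {"temperature": {"value": req["celsius"],
                                          "unit": "celsius"}},      # evolved shape
      }

      def handle(request):
          version = request.get("api_version", 1)   # unversioned clients get v1
          handler = HANDLERS.get(version)
          if handler is None:
              raise ValueError(f"unsupported api_version {version}")
          return handler(request)

      print(handle({"api_version": 2, "celsius": 21.5}))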

    Out of curiosity, if I were to bring the codebase up to date (C++), and make it downloadable, installable, and usable for free/open source, maybe for Linux and Windows, would anyone be interested in contributing to a Kickstarter for that?

    regards,

    Vlad

  • It's not enough to just have APIs published unidirectionally, if you want the system to evolve into something optimally fit for a particular job.

    Think of layers in a convolutional neural network, for example. Each layer of neural units provides information to the next layer, but fixing the output of the higher layers limits the trainability and ultimate accuracy of the trained network. In order to maximize fitness, full backpropagation (or similar) is needed, with all layers being trained.

    What's needed for self-negotiated APIs is a generalization of the CNN model (or similar) into a variable-length serial communication format. Humans would define a fitness function either explicitly or implicitly by interacting with the system, and the self-negotiating API system would use some many-parameter optimization algorithm to alter both the Server and Client(s) to maximize the total fitness.
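
    A deliberately tiny sketch of that loop in Python (everything here - the parameters, the fitness function - is invented for illustration): perturb either side at random and keep only the changes that improve the shared fitness.

      import random

      # Toy joint optimization: alter either "server" or "client" parameters and
      # keep a change only if it improves a (made-up) human-defined fitness.
      random.seed(0)

      def fitness(server, client):
          # pretend the two sides "understand" each other when their parameters match
          return -sum((s - c) ** 2 for s, c in zip(server, client))

      server = [random.uniform(-1, 1) for _ in range(4)]
      client = [random.uniform(-1, 1) for _ in range(4)]

      for _ in range(2000):
          side = server if random.random() < 0.5 else client
          i = random.randrange(len(side))
          old, before = side[i], fitness(server, client)
          side[i] += random.gauss(0, 0.1)
          if fitness(server, client) < before:      # keep only improvements
              side[i] = old

      print(round(fitness(server, client), 4))      # approaches 0 as the sides converge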

  • Since learning it, I've thought that REST is really aimed at humans. It's all very well being able to navigate state, but unless a machine knows what to do with a given state beforehand it's not much use to it, other than perhaps to gather data, in which case that data won't mean anything unless an intelligence interprets it.
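
    A made-up hypermedia response in Python illustrates the point: a human can guess what the links mean, but a client program can only act on rel names it was already coded against.

      # Invented hypermedia response: the link relations are only useful to a
      # client that already knows what "cancel" or "invoice" mean.
      response = {
          "order": {"id": 42, "status": "pending"},
          "_links": {
              "self":    {"href": "/orders/42"},
              "cancel":  {"href": "/orders/42/cancel"},
              "invoice": {"href": "/orders/42/invoice"},
          },
      }

      KNOWN_RELS = {"self", "cancel"}   # everything else is opaque to this client

      for rel, link in response["_links"].items():
          if rel in KNOWN_RELS:
              print(f"can follow {rel}: {link['href']}")
          else:
              print(f"no idea what to do with {rel}")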

    Since written code will only ever do a defined set of operations, most web APIs are fine being written in an RPC style.

    The only time I've advocated a RESTful approach is when there was a lot of public data being exposed and human developers may well have explored the data.

    When an AI can navigate and interpret data that it hasn't seen before, then things will get interesting.

  • Unless somebody does it well, and it catches on, we are going to keep reinventing UDDI, semantic web services, and other past attempts, again and again...

  • It's sad that nothing has been learned from the past 40 years, so instead of building on the good parts of, say, ASN.1, amateurish protocols like JSON are invented, solving none of the hard bits while improving some superficial readability problems.

  • I think it's difficult to tell what is going to be solved by AGI (https://www.wikiwand.com/en/Artificial_general_intelligence) and what isn't.

    Regarding APIs that understand each other, it might very well be too much in the direction of AGI.

    That means that if we want to solve this, we have to look at research that allows AIs to understand each other.

    * Imitation learning

    * Language grounding

    The latter is also known more abstractly as the symbol grounding problem (https://www.wikiwand.com/en/Symbol_grounding_problem) and has led to many debates over the years. A collection of APIs seems useful; having them interact with each other - with the human out of the loop - might be a lofty but unattainable goal.

  • Why is Google search lousy for API discovery? If I search for "email sending API", I get results for MailJet and SendGrid - relevant ones. If I search for "entity identification API", I get AlchemyAPI back - also relevant.

  • I think "using APIs" is AI-complete, thus making "autonomous APIs" a pipe dream until we have an AGI, and at that point APIs are not that interesting anymore.

  • This is my first time reading about JSON-LD, but it just sounds like a translation layer to map someone else's keys into my keys (or vice versa). Does this really get me a whole lot?

    It seems like protocol buffers was made to solve a lot of the problems brought up here. GraphQL types and schemas seem to go a long way towards this as well.
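
    As far as I can tell, the translation-layer reading amounts to something like the following Python sketch (keys and mapping invented for illustration), which is why I'm not sure how much it buys over an explicit mapping table or a typed schema.

      # Invented example of the "map their keys onto my keys" reading:
      their_payload = {"tempC": 21.5, "obsTime": "2017-01-01T12:00:00Z"}
      their_keys_to_mine = {"tempC": "temperature_celsius", "obsTime": "observed_at"}

      my_payload = {their_keys_to_mine[k]: v for k, v in their_payload.items()}
      print(my_payload)
      # {'temperature_celsius': 21.5, 'observed_at': '2017-01-01T12:00:00Z'}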