Generative Teaching Networks: Accelerating Neural Architecture Search

  • I made a video explaining this research if you are interested: https://www.youtube.com/watch?v=lmnJfLjDVrI&t=4s

  • Isn't it interesting which is more efficient: neural nets or a learning Mealy machine? Anyway, optimizing an exhaustive search is a slow but assured way of solving the self-driving car problem. You don't need the most accurate simulation for it, as Elon Musk says here:

    https://www.youtube.com/watch?v=Ucp0TTmvqOE&t=7358

    A "brute-force" algorithm (an exhaustive search, in other words) is the easiest way to find an answer to almost any engineering problem. But it often must be optimized before being computed. The optimization may be done by an AI agent based on Neural Nets, or on a Learning Mealy Machine.

    A Learning Mealy Machine is a finite automaton in which the training data stream is remembered by constructing disjunctive normal forms of the automaton's output function and of the transition function between its states. Those functions are then optimized (lossily compressed by logic transformations such as De Morgan's laws, arithmetic rules, loop unrolling/rolling, etc.) into more generalized forms. That introduces random hypotheses into the automaton's functions, so it can then be used for inference. The optimizer for the automaton's functions may be another AI agent, or any heuristic algorithm you like...
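
    A toy sketch of this idea, written as Python for concreteness (the class, the method names, and the greedy literal-dropping heuristic are only illustrative, not a real system): observed (state, input) -> (next state, output) steps are memorized as DNF minterms, and the output function is then generalized by dropping literals from each term whenever the shortened term still agrees with every memorized example it covers.

```python
class LearningMealyMachine:
    def __init__(self):
        # minterm over (state bits + input bits) -> (next state bits, output bit)
        self.examples = {}

    def observe(self, state_bits, input_bits, next_state_bits, output_bit):
        """Memorize one step of the training data stream as a minterm."""
        key = tuple(state_bits) + tuple(input_bits)
        self.examples[key] = (tuple(next_state_bits), output_bit)

    def generalized_output_dnf(self):
        """Lossily compress the output function: greedily drop literals from each
        positive minterm while the shortened term still covers no negative example."""
        dnf = []
        for minterm, (_, out) in self.examples.items():
            if out != 1:
                continue
            term = dict(enumerate(minterm))          # variable index -> required value
            for var in list(term):
                shorter = {k: v for k, v in term.items() if k != var}
                covers_only_positives = all(
                    o == 1
                    for m, (_, o) in self.examples.items()
                    if all(m[k] == v for k, v in shorter.items())
                )
                if covers_only_positives:
                    term = shorter                   # the literal was redundant
            dnf.append(term)
        return dnf

    def infer_output(self, state_bits, input_bits, dnf):
        """Evaluate the generalized DNF on a possibly unseen (state, input) pair."""
        point = tuple(state_bits) + tuple(input_bits)
        return int(any(all(point[k] == v for k, v in term.items()) for term in dnf))


# Usage: three observed steps generalize to "output = state bit OR input bit".
m = LearningMealyMachine()
m.observe([0], [1], [1], 1)
m.observe([1], [0], [1], 1)
m.observe([0], [0], [0], 0)
dnf = m.generalized_output_dnf()
print(dnf)                              # [{1: 1}, {0: 1}]
print(m.infer_output([1], [1], dnf))    # unseen (state=1, input=1) case; hypothesis says 1
```

    Here the three memorized steps compress into the hypothesis "output = state OR input", so the unseen (state=1, input=1) case is answered without ever having been observed.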

    Some interesting engineering (and scientific) problems are:

    - finding machine code for a car controller that makes it able to drive autonomously;
    - finding machine code for a bipedal robot controller that makes it able to work in warehouses and factories;
    - finding a CAD file that describes the design of a spheromak working with a guiding center drift generator (hypothetical device, idk!);
    - finding a CAD file that describes some kind of working Smoluchowski’s trapdoor (under some specific conditions, of course);
    - finding a file that describes an automaton working in accordance with the data of a scientific experiment;
    - finding a file that describes the manufacturing steps to produce the first molecular nanofactory in the world.

    Related work by Embecosm is here: superoptimization.org. Though it seems people have superoptimized only tiny programs so far, as you can see from the ICLR 2017 paper (App. D): arxiv.org/abs/1611.01787. Loops can also be rolled, not just unrolled; that kind of loop optimization seems to be absent here: en.wikipedia.org/wiki/Loop_optimization

    If you have any questions, ask me here: https://www.facebook.com/eugene.zavidovsky

  • For those interested, I also work in this area: https://medium.com/capital-one-tech/why-you-dont-necessarily...

    Arguably, this is still a new field, but IMO it will eventually become standard practice. I think you can completely separate humans from the data and still do machine learning (and likely analytics). This would dramatically limit data breaches if implemented properly.

  • I really like the idea of optimising the 'direct' training data, and wonder how it would interact with the use of synthetic data as the 'indirect' training data. Or perhaps some sort of restriction on the (optimised) 'direct' training data as a form of regularisation. Lots of potential ideas to explore here.
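
    A minimal sketch of what optimising the "direct" training data can look like, assuming a PyTorch-style setup (this is only an illustration of the general idea, not the paper's code; the toy task, tiny learner, and hyperparameters are made up): the synthetic examples are themselves trainable tensors, a fresh learner takes a few unrolled SGD steps on them, and the learner's loss on the real "indirect" data is backpropagated into the synthetic examples.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Real ("indirect") task: classify 2-D points by the sign of x0 + x1.
real_x = torch.randn(256, 2)
real_y = (real_x.sum(dim=1) > 0).long()

# Learnable synthetic ("direct") dataset: 8 points with soft labels, both trainable.
syn_x = torch.randn(8, 2, requires_grad=True)
syn_y = torch.zeros(8, 2, requires_grad=True)      # logits for soft 2-class labels

outer_opt = torch.optim.Adam([syn_x, syn_y], lr=1e-2)

def train_learner_on_synthetic(steps=5, lr=0.5):
    """Train a freshly initialized linear learner on the synthetic data with
    unrolled SGD, keeping the graph so gradients flow back to syn_x / syn_y."""
    w = (0.1 * torch.randn(2, 2)).requires_grad_()
    b = torch.zeros(2, requires_grad=True)
    params = [w, b]
    for _ in range(steps):
        logits = syn_x @ params[0] + params[1]
        loss = F.cross_entropy(logits, syn_y.softmax(dim=1))
        grads = torch.autograd.grad(loss, params, create_graph=True)
        params = [p - lr * g for p, g in zip(params, grads)]
    return params

for step in range(300):
    w, b = train_learner_on_synthetic()
    # Meta-objective: how well a learner trained only on the synthetic data
    # performs on the real data. Its gradient updates the synthetic data itself.
    meta_loss = F.cross_entropy(real_x @ w + b, real_y)
    outer_opt.zero_grad()
    meta_loss.backward()
    outer_opt.step()

print("final meta-loss on real data:", meta_loss.item())
```

    In the paper itself the synthetic data comes from a trained generator network rather than a free tensor, and the learner is re-initialized each time, which is what lets the learned curriculum transfer across architectures and speed up architecture search.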

  • Not my area of expertise. Is the innovation here searching over generated training examples that appear to optimize training efficiency/rate of learning for the target task?

  • Does this technique potentially allow training on smaller datasets? I am thinking of applications to neuroimaging datasets, which usually number in the hundreds.