Show HN: Census – Export for data warehouses

  • Brad from Census (https://www.getcensus.com) here - I wanted to add some technical detail for the HN crowd.

    On its face, a Census workflow is a simple "program" that our users author in a point-and-click manner – read some data from System A and broadcast it to Systems B, C, and D. But to "compile" that program, Census has to determine:

      - Which APIs to use (bulk, streaming, etc.)
    
      - The schema of each destination (which can change out from underneath us at run time!)
    
      - The semantics of reads and writes from each system (atomicity, isolation, how to roll back if a write fails)
    
      - How to map data with high fidelity across strongly-, weakly-, and dynamically-typed data stores
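
    To make the type-mapping point concrete, here is a minimal sketch in Python. It is only an illustration, not how Census actually does it: the type names, `map_value`, and `map_row` are all hypothetical. The idea is to coerce each warehouse value to the destination's declared type and refuse any conversion that would silently lose information.

```python
from datetime import datetime, timezone
from decimal import Decimal


def map_value(value, dest_type):
    """Coerce a warehouse value to a (hypothetical) destination type.

    Raises ValueError instead of performing a lossy conversion.
    """
    if value is None:
        return None  # nulls pass through unchanged
    if dest_type == "string":
        return str(value)
    if dest_type == "integer":
        if isinstance(value, (float, Decimal)) and value != int(value):
            raise ValueError(f"lossy conversion of {value!r} to integer")
        return int(value)
    if dest_type == "number":
        return float(value)
    if dest_type == "boolean":
        # bool("false") is True in Python, so only accept real booleans.
        if not isinstance(value, bool):
            raise ValueError(f"{value!r} is not a boolean")
        return value
    if dest_type == "datetime":
        if not isinstance(value, datetime) or value.tzinfo is None:
            raise ValueError(f"{value!r} is not a timezone-aware datetime")
        return value.astimezone(timezone.utc).isoformat()
    raise ValueError(f"unsupported destination type: {dest_type}")


def map_row(row, schema):
    """Map one source row onto a destination schema of {field: type}."""
    return {field: map_value(row.get(field), t) for field, t in schema.items()}
```

    In this sketch a row such as `{"id": Decimal("42"), "score": 3.5}` maps cleanly to `{"id": "integer", "score": "number"}`, while mapping `score` to `"integer"` raises rather than truncating.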
    
    That's just compilation – then we need to execute that compiled plan and move massive amounts of data with low latency and high throughput, all while handling byzantine failures in source and destination systems and automatically rolling back, recovering, or helping users "debug" their workflows when things go wrong.
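
    As a rough sketch of the "roll back when a write fails" part, assuming a destination that can report a record's previous value and restore it (many real APIs cannot, which is a large part of the difficulty). `FakeDestination` and `run_sync` are hypothetical names used for illustration only:

```python
class FakeDestination:
    """In-memory stand-in for a destination system (purely hypothetical)."""

    def __init__(self, fail_on=None):
        self.records = {}
        self.fail_on = fail_on  # key that simulates a failed write

    def upsert(self, key, record):
        """Write a record and return the value it replaced (None if new)."""
        if key == self.fail_on:
            raise RuntimeError(f"write failed for {key}")
        previous = self.records.get(key)
        self.records[key] = record
        return previous

    def restore(self, key, previous):
        """Undo a write by putting back the previous value."""
        if previous is None:
            self.records.pop(key, None)
        else:
            self.records[key] = previous


def run_sync(rows, destinations):
    """Write every row to every destination; undo all writes if any one fails."""
    undo_log = []
    try:
        for dest in destinations:
            for key, record in rows.items():
                previous = dest.upsert(key, record)
                undo_log.append((dest, key, previous))
    except Exception:
        # Replay the undo log newest-first so each key returns to its
        # original value.
        for dest, key, previous in reversed(undo_log):
            dest.restore(key, previous)
        return False
    return True
```

    Keeping an undo log is a compensating-action approach rather than a true transaction: other readers can still observe the intermediate state, so it does not give isolation.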

    There's a lot of depth to this (and we haven't "solved" it by any means) - happy to answer questions here or at brad@getcensus.com if you have them!

  • Excited to follow your progress! I view this problem as one of the biggest gaps in today's "Cloud Data Ecosystem". Tools like Stitch and Fivetran make it super easy to extract data from source systems; next-gen cloud data platforms like Snowflake make storing, transforming, and querying that data a breeze (especially with the help of tools like dbt and dataform); and there are a ton of powerful and easy-to-use BI tools for visualizing and digesting that data. But the minute you need to send that data to other systems, it's back to painful, failure-prone, and mind-numbing scripts.

  • Nice, congrats on launching! I'm excited to try it out.

    Curious - what if there's a transformation that happens with data from the data warehouse but can't be performed with SQL (such as a Python script)? Is there a way to send that data back into the integrations that you support? Or would it be best to push back into the data warehouse and use Census from there?

    In other words, can you only transform with Census using SQL? Or other languages as well?

  • When can we create an account with email/password instead of Google?