Show HN: The Java Web Scraping Handbook

  • Hey Hacker News,

    Today Pierre and I are releasing the Java Web Scraping Handbook for FREE!

    And by free we mean you don't even have to give us your email address!

    Some backstory about the book: I originally wrote it in 2018 after working on various web scraping projects for startups (Mint.com-like) and banks.

    The first four chapters are language agnostic, and the last can be applied to any language, so don't be scared if you don't know Java!

    By the end of the book, you will know:

    - How to scrape any website

    - Just enough XPath / regex / DOM knowledge to be dangerous

    - How to deal with JavaScript-heavy websites (single-page applications, etc.)

    - How to programmatically perform actions on a website behind a login form

    - How to parse information inside PDFs

    - How to bypass CAPTCHAs

    - How to deploy your scrapers in the cloud

    I'm happy to answer any questions about the book :)