Job Responsibilities:
Create crawlers for each source URL using Python libraries such as Scrapy, Selenium, Requests, BeautifulSoup, and Splash.
Create and maintain Scrapy pipelines and middlewares to manage crawler output (see the sketch after this list).
Create crawlers for all types of websites, regardless of technical roadblocks.
Maintain the crawlers to overcome technical challenges such as IP bans, geolocation bans, CAPTCHAs, and bot-blocking services.
Write SQL queries to manage database operations using Python modules such as SQLAlchemy.
Deploy the Python scripts / crawlers to Linux-based AWS servers.
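For illustration, the pipeline and database work described above might look something like the minimal sketch below. It is a hypothetical example only: the pages table, its columns, and the SQLite connection string are assumptions, not details from this posting.

from sqlalchemy import create_engine, text


class DatabaseWriterPipeline:
    """Hypothetical Scrapy item pipeline that persists scraped items via SQLAlchemy."""

    def open_spider(self, spider):
        # In a real project the connection URL would come from Scrapy settings;
        # SQLite is used here only to keep the sketch self-contained.
        self.engine = create_engine("sqlite:///scraped.db")
        with self.engine.begin() as conn:
            conn.execute(text("CREATE TABLE IF NOT EXISTS pages (url TEXT, title TEXT)"))

    def process_item(self, item, spider):
        # Insert each scraped item as one row; the url/title fields are assumed for illustration.
        with self.engine.begin() as conn:
            conn.execute(
                text("INSERT INTO pages (url, title) VALUES (:url, :title)"),
                {"url": item.get("url"), "title": item.get("title")},
            )
        return item

    def close_spider(self, spider):
        self.engine.dispose()

A pipeline like this would be enabled through the project's ITEM_PIPELINES setting; middlewares for proxy rotation or retry handling are registered in a similar way.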
Skills:
Must Haves:
Strong hands-on experience in Python programming.
Good experience with scraping libraries such as Requests, BeautifulSoup, Selenium and Scrapy.
Proven 3+ years of experience working with web development frameworks such as Flask/Django/FastAPI/Tornado/Pandas.
Knowledge of building APIs and services using REST (see the sketch after this list).
Experience with any RDBMS and strong SQL knowledge.
Clear understanding of object-oriented concepts.
Excellent troubleshooting skills.
Proficient understanding of code versioning tools such as Git.
Basic understanding of front-end technologies such as JavaScript, HTML5, and CSS3.
Understanding of fundamental design principles behind a scalable application.
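As a point of reference for the REST requirement above, a minimal endpoint might look like the sketch below. Flask is assumed purely because it is one of the frameworks listed; the /items route and its in-memory store are hypothetical.

from flask import Flask, jsonify, request

app = Flask(__name__)

# In-memory list standing in for the RDBMS mentioned above (illustration only).
ITEMS = []


@app.route("/items", methods=["GET"])
def list_items():
    # Return every stored item as JSON.
    return jsonify(ITEMS)


@app.route("/items", methods=["POST"])
def create_item():
    # Accept a JSON payload, store it, and echo it back with a 201 status.
    item = request.get_json(force=True)
    ITEMS.append(item)
    return jsonify(item), 201


if __name__ == "__main__":
    app.run(debug=True)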
Nice to have:
Familiarity with UI frameworks such as Angular/ReactJS.
Strong unit testing and debugging skills.
Strong knowledge of data engineering platforms such as Airflow.
Experience with Google Cloud.
Docker.
Agile and Scrum. Even better if you have worked with Jira and Confluence.
Experience with ETL.
Celery.
Excellent interpersonal skills and the ability to work with a diverse team.
https://jobs.lever.co/invisible/e3f21175-7605-4b30-8790-5ddbd97ab3b5