Sharpen your teeth in Data Analytics (and maybe create a portfolio while you’re at it)


 

As a novice data analyst myself, I truly understand the mind of another beginner raring to go out into the world, tame a wild data set and analyse the heck out of it. The urge to pick up the same old Kaggle data set that is begging not to be analysed for the umpteenth time is understandable, even if it means arriving at the same insights as the 24,637 people who came before you. One must not be utterly taken aback, then, when the face of the interviewer (seeing the same analysis for the thirteenth time that day) turns red in anger without the application of any sort of conditional formatting.

So, one might naturally enquire into the remedy to such an ailment. I would then point you towards the less traversed yet endlessly fascinating direction of obscure publicly available datasets. From an exhaustive data set of all known passengers of the RMS Titanic to the largest reference data set of the human genome, not only do they make for remarkably interesting candidates for analytics projects, but they also set you apart in the eyes of interviewers. 

The data out there is so diverse that every budding data enthusiast is bound to discover something that piques their analytical interest.


So here are some of my favourites that you can check out:
  1. Data.gov.in: Do your parents boast that they used to shop for a whole week’s worth of groceries all for 15 rupees in 2009? Bust out the last 15 years of Consumer Price Index (CPI) data from the Government of India’s official data repository to prove them wrong, once and for all. Developed by the National Informatics Centre (NIC) under the aegis of the Ministry of Electronics and Information Technology (aka MeitY), Data.gov.in hosts more than 6,00,000 resources covering sectors including crime, judiciary, urban and sports. At this data heaven, everyone is sure to find something that would make a worthy addition to their portfolio. Not only that, you also get access to a wide selection of their APIs. Bonus points for no sign-ups necessary.

  2. Awesome public datasets (GitHub): From Swiss apartment models to the biggest crowdsourced database of the American gut microbiome, ‘Awesome public datasets’ is a source of admittedly more global, yet no less amusing datasets which one can explore in search of their next project. These datasets were painstakingly collected and tidied from blogs and user responses. Most of them are absolutely free and part of the open-source movement. Again, no obligation to sign up!

  3. Sindresorhus’s Awesome Collection (GitHub): This list, my friends, is the GitHub equivalent of Sir Ravindra Jadeja because of the all-rounder variety of resources it holds. Not only is it home to learning resources ranging from fintech to Generative AI, it also holds free books, public datasets and much more. This list is a one-stop shop to learn anything and everything! Do, however, make sure to have blinders on while you visit this page, otherwise you’re guaranteed to be distracted along the way (speaking from personal experience).

  4. Figshare: This one is for all the academically minded folks out there. Figshare has an endless trove of datasets from close to 25 categories ranging from economics to earth sciences. Be it China’s Covid-19 case data from January 2020 or the species of native plants in any state of the US, if you can think of it, this data repository probably has it. With a clean UX, it successfully distinguishes itself from the typical academic website, making it easier for a newbie to find their way around. The good news? You can download up to 20 GB of this data for FREE! (thank me later)

  5. Google Trends: Did you know that mentions of the term “big data” peaked in October 2018? Wanna know why? Then this is my homework for you: find out through Google’s own repository of all things trends and keywords. Alright! I must admit this one isn’t very “obscure” but it deserves a mention nonetheless. A pioneer in “nowcasting”, Google Trends is the backbone of all sorts of projects that need real-time updates, the OECD’s weekly GDP tracker being a good example. Being the good Samaritans they are, they have an extremely helpful section right upfront to teach newbies how to make the most of this data as well.

  6. World Bank Open Data: Wonder how the GDP of the nations of the world has changed over the past 30 years? No biggie, the World Bank has you covered! With its ‘World Bank Open Data’ initiative, it has made a true wealth of financial and fiscal data available to the masses. This data is available to download in CSV, XML and Excel formats, along with access to their own data bank and thematic tables for easy understanding. All at the click of a button!

  7. OECD Data Explorer: In their own words, the Organisation for Economic Co-operation and Development (OECD) (phew) is an international organisation working towards making better policies for better lives. But they’re not all talk: they’ve made data on everything from tobacco consumption and variation in body weight between nationalities to wildfires available to one and all. This excellent selection of data can help you analyse everything from levels of alcoholism between states in India to the variation in the occurrence of obesity within the country. Truly a great way to spend one’s Saturday, don’t you think? (just kidding)

  8. UCI Machine Learning Repository: Aimed at machine learning enthusiasts, this is an excellent repository of more than 600 datasets for all the newbies trying to get familiar with ML. This will ensure that you go from being clueless about ML to an ML connoisseur in no time!
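
Once you have downloaded a CSV from any of these sources, the very first analysis can be tiny. Here is a minimal Python sketch, assuming a simplified two-column file with `year` and `value` headers (real exports from these sites will have their own column names, so adjust accordingly). The helper names `load_series` and `yearly_pct_change` are my own, and the numbers are made-up placeholders, not real CPI or GDP figures.

```python
import csv
import io

# Hypothetical sample in the shape of a simple "year,value" CSV export.
# The numbers are made-up placeholders, NOT real CPI or GDP figures.
SAMPLE_CSV = """year,value
2020,100.0
2021,105.0
2022,110.25
"""


def load_series(csv_text):
    """Parse 'year,value' CSV text into (year, value) tuples sorted by year."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [(int(row["year"]), float(row["value"])) for row in reader]
    return sorted(rows)


def yearly_pct_change(series):
    """Return {year: percent change versus the previous entry in the series}."""
    changes = {}
    # Pair each entry with the one after it: (2020, 2021), (2021, 2022), ...
    for (_, prev_value), (year, value) in zip(series, series[1:]):
        changes[year] = (value - prev_value) / prev_value * 100
    return changes


series = load_series(SAMPLE_CSV)
changes = yearly_pct_change(series)
for year, pct in changes.items():
    print(f"{year}: {pct:+.2f}%")
```

To try it on a real download, read the file's text and pass it to `load_series` in place of `SAMPLE_CSV`, after renaming the columns to match. The first year never appears in the result because it has nothing to be compared against.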

So, I hope that, equipped with these sources, you will make your portfolio stand out like a kangaroo in a penguin enclosure. Do always remember the wise words of Franklin D. Roosevelt, “The only thing we have to fear is fear itself, and maybe not backing up important data” (don’t quote me on that though).

Till we meet again, Data comrades!
 
About Author
Vasudev Pandey
I am a budding data scientist and mechatronics engineer with a passion for history and finance. I write about anything and everything I find interesting.

Comments

  1. Thank you for sharing these public datasets with the TakeOff Talent community, Vasudev. This will surely help many people who are confused around how to build a portfolio in analytics as well as in data science.

  2. great article, thank you so much for sharing vasudev

  3. Thanks Vasu...very helpful

  4. This will surely help the DS community. - Dan


