
Sharpen your teeth in Data Analytics (and maybe create a portfolio while you’re at it)


 

As a novice data analyst myself, I truly understand the mind of another beginner raring to go out into the world, tame a wild dataset and analyse the heck out of it. Understandable, then, is the urge to pick up the same old Kaggle dataset that is begging not to be analysed for the umpteenth time, only to arrive at the same insights as the 24,637 people who came before you. One must not be utterly taken aback, then, when the face of the interviewer (seeing the same analysis for the thirteenth time that day) turns red in anger without the application of any sort of conditional formatting.

So, one might naturally enquire about the remedy for such an ailment. I would then point you in the less traversed yet endlessly fascinating direction of obscure, publicly available datasets. From an exhaustive dataset of all known passengers of the RMS Titanic to the largest reference dataset of the human genome, not only do they make remarkably interesting candidates for analytics projects, but they also set you apart in the eyes of interviewers.

The data out there is so diverse that every budding data enthusiast is bound to discover something that piques their analytical interest.


So here are some of my favourites that you can check out:
  1. Data.gov.in: Do your parents boast that they used to shop for a whole week’s worth of groceries for just 15 rupees in 2009? Bust out the last 15 years of Consumer Price Index (CPI) data from the Government of India’s official data repository to prove them wrong, once and for all. Developed by the National Informatics Centre (NIC) under the aegis of the Ministry of Electronics and Information Technology (aka MeitY), Data.gov.in hosts more than 6,00,000 resources covering sectors such as crime, judiciary, urban and sports. At this data heaven, everyone is sure to find something that would make a worthy addition to their portfolio. Not only this, but you also get access to a wide selection of their APIs. Bonus points for no sign-ups necessary.

  2. Awesome public datasets (GitHub): From Swiss apartment models to the biggest crowdsourced database of the American gut microbiome, ‘Awesome public datasets’ is a source of admittedly more global, yet no less amusing datasets which one can explore in search of their next project. These datasets were painstakingly collected and tidied from blogs and user responses. Most of them are absolutely free and part of the open source movement. Again, no obligation to sign up!

  3. Sindresorhus’s Awesome Collection (GitHub): This list, my friends, is the GitHub equivalent of Sir Ravindra Jadeja because of the all-rounder variety of resources it holds. Not only is it home to learning resources ranging from fintech to Generative AI, it also holds free books, public datasets and much more. This list is a one-stop shop to learn anything and everything! Do, however, make sure to have blinders on while you visit this page, otherwise you’re guaranteed to be distracted along the way (speaking from personal experience).

  4. Figshare: This one is for all the academically minded folks out there. Figshare has an endless trove of datasets from close to 25 categories ranging from economics to earth sciences. Be it China’s Covid-19 case data from January 2020 or the species of native plants in any state of the US, if you can think of it, this data repository probably has it. With a clean UX, it successfully distinguishes itself from the typical academic website, making it easier for a newbie to find their way around. The good news? You can download up to 20 GB of this data for FREE! (thank me later)

  5. Google Trends: Did you know that searches for the term “big data” peaked in October 2018? Wanna know why? Then here is your homework: find out through Google’s own repository of all things trends and keywords. Alright! I must admit this one isn’t very “obscure”, but it deserves a mention nonetheless. A pioneer in “nowcasting”, Google Trends is the backbone of all sorts of projects that need real-time updates, the OECD’s weekly GDP tracker being a good example. Being the good Samaritans they are, the folks at Google have put an extremely helpful section right upfront to teach newbies how to make the most of this data as well.

  6. World Bank Open Data: Wonder how the GDP of the nations of the world has changed over the past 30 years? No biggie, the World Bank has you covered! With its ‘World Bank Open Data’ initiative, it has made a true wealth of financial and fiscal data available to the masses. This data is available to download in CSV, XML and Excel formats, along with access to its own DataBank and thematic tables for easy understanding. All at the click of a button!

  7. OECD Data Explorer: In their own words, the Organisation for Economic Co-operation and Development (OECD) (phew) is an international organisation working towards making better policies for better lives. But they’re not all talk: they’ve made data on everything from tobacco consumption and variation in body weight between nationalities to wildfires available to one and all. This excellent selection of data can help you analyse everything from levels of alcoholism between states in India to the variation in obesity rates within the country. Truly a great way to spend one’s Saturday, don’t you think? (just kidding)

  8. UCI Machine Learning Repository: Aimed at machine learning enthusiasts, this is an excellent repository of more than 600 datasets for all the newbies trying to familiarise themselves with ML. This will ensure that you go from ML clueless to ML connoisseur in no time!
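
To make the CPI idea from the first entry a little more concrete, here is a minimal Python sketch of the arithmetic involved. Everything specific in it is an assumption made for illustration: the two-column CSV layout (`year`, `cpi`), the helper names `load_index`, `adjust_price` and `percent_change`, and above all the index values, which are made-up placeholders and not real CPI figures. A real export from Data.gov.in or any other portal on this list will almost certainly have different column names and need some cleaning first.

```python
import csv
import io

# Hypothetical CSV layout with made-up index values, for illustration only.
# These are NOT real CPI figures; swap in the file you actually download.
SAMPLE_CSV = """year,cpi
2009,100.0
2014,150.0
2024,250.0
"""


def load_index(text):
    """Parse a two-column year/cpi CSV string into a {year: index} dict."""
    reader = csv.DictReader(io.StringIO(text))
    return {int(row["year"]): float(row["cpi"]) for row in reader}


def adjust_price(amount, from_year, to_year, index):
    """Scale an amount by the ratio of the index values in the two years."""
    return amount * index[to_year] / index[from_year]


def percent_change(from_year, to_year, index):
    """Percentage change in the index between the two years."""
    return (index[to_year] / index[from_year] - 1) * 100


index = load_index(SAMPLE_CSV)

# With the placeholder values above, 15 rupees in 2009 scales to:
print(adjust_price(15, 2009, 2024, index))   # 37.5
# and the index itself has risen by this many percent:
print(percent_change(2009, 2024, index))     # 150.0
```

Once you replace `SAMPLE_CSV` with the contents of a real file, the same two functions give you a quick sanity check on any "back in my day" price claim before you build charts on top of it.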

So, I hope that equipped with these sources, you will make your portfolio stand out like a kangaroo in a penguin enclosure. Do always remember the wise words of Franklin D. Roosevelt, “The only thing we have to fear is fear itself, and maybe not backing up important data” (don’t quote me on that though).

Till we meet again, Data comrades!
 
About Author
Vasudev Pandey
I am a budding data scientist and mechatronics engineer with a passion for history and finance. I write about anything and everything I find interesting.

Comments

  1. Thank you for sharing these public datasets with the TakeOff Talent community, Vasudev. This will surely help many people who are confused around how to build a portfolio in analytics as well as in data science.

  2. great article, thank you so much for sharing vasudev

  3. Thanks Vasu...very helpful

  4. This will surely help the DS community. - Dan


