
Sharpen your teeth in Data Analytics (and maybe create a portfolio while you’re at it)


 

As a novice data analyst myself, I truly understand the mind of another beginner raring to go out into the world, tame a wild data set and analyse the heck out of it. Understandable, then, is the urge to pick up the same old Kaggle data set that is begging not to be analysed for the umpteenth time, all whilst arriving at the same insights as the 24,637 people who came before you. One must not be utterly taken aback, then, when the face of one’s interviewer (seeing the same analysis for the thirteenth time that day) turns red in anger without the application of any sort of conditional formatting.

So, one might naturally enquire into the remedy to such an ailment. I would then point you towards the less traversed yet endlessly fascinating direction of obscure publicly available datasets. From an exhaustive data set of all known passengers of the RMS Titanic to the largest reference data set of the human genome, not only do they make for remarkably interesting candidates for analytics projects, but they also set you apart in the eyes of interviewers. 

The data out there is so diverse that every budding data enthusiast is bound to discover something that piques their analytical interest.


So here are some of my favourites that you can check out:
  1. Data.gov.in: Do your parents boast that they used to shop for a whole week’s worth of groceries for just 15 rupees in 2009? Bust out the last 15 years of Consumer Price Index (CPI) data from the Government of India’s official data repository to prove them wrong, once and for all. Developed by the National Informatics Centre (NIC) under the aegis of the Ministry of Electronics and Information Technology (aka MeitY), Data.gov.in has data from more than 6,00,000 resources across sectors including crime, judiciary, urban and sports. At this data heaven, everyone is sure to find something that would make a worthy addition to their portfolio. Not only that, but you also get access to a wide selection of their APIs. Bonus points for no sign-ups necessary.

  2. Awesome public datasets (GitHub): From Swiss apartment models to the biggest crowdsourced database of the American gut microbiome, ‘Awesome public datasets’ is a source of admittedly more global, yet no less amusing, datasets which one can explore in search of their next project. These datasets were painstakingly collected and tidied from blogs and user responses. Most of them are absolutely free and part of the open source movement. Again, no obligation to sign up!

  3. Sindresorhus’s Awesome Collection (GitHub): This list, my friends, is the GitHub equivalent of Sir Ravindra Jadeja because of the all-rounder variety of resources it holds. Not only is it home to learning resources ranging from fintech to Generative AI, it also holds free books, public datasets and much more. This list is a one-stop shop to learn anything and everything! Do, however, make sure to have blinders on while you visit this page, otherwise you’re guaranteed to be distracted along the way (speaking from personal experience).

  4. Figshare: This one is for all the academically minded folks out there. Figshare has an endless trove of datasets from close to 25 categories ranging from economics to earth sciences. Be it China’s COVID-19 case data from January 2020 or the species of native plants in any state of the US, if you can think of it, this data repository probably has it. With a clean UX, it successfully distinguishes itself from the typical academic website, making it easier for a newbie to find their way around. The good news? You can download up to 20 GB of this data for FREE! (thank me later)

  5. Google Trends: Did you know that mentions of the term “big data” peaked in October 2018? Wanna know why? Then this is my homework for you: find out through Google’s own repository of all things trends and keywords. Alright! I must admit this one isn’t very “obscure”, but it deserves a mention nonetheless. A pioneer in “nowcasting”, Google Trends is the backbone of all sorts of projects that need real-time updates, the OECD’s weekly GDP tracker being a good example. Being the good Samaritans they are, they have an extremely helpful section right up front to teach newbies how to make the most of this data as well.

  6. World Bank Open Data: Wonder how the GDP of the nations of the world has changed over the past 30 years? No biggie, the World Bank has you covered! With its ‘World Bank Open Data’ initiative, it has made a true wealth of financial and fiscal data available to the masses. This data is available to download in CSV, XML and Excel formats, along with access to their own DataBank and thematic tables for easy understanding. All at the click of a button!

  7. OECD Data Explorer: In their own words, the Organisation for Economic Co-operation and Development (OECD) (phew) is an international organisation working towards making better policies for better lives. But they’re not all talk: they’ve made data on everything from tobacco consumption and variation in body weight between nationalities to wildfires available to one and all. This excellent selection of data can help you analyse everything from levels of alcoholism between states in India to the variation in the occurrence of obesity within the country. Truly a great way to spend one’s Saturday, don’t you think? (just kidding)

  8. UCI Machine Learning Repository: Aimed at machine learning enthusiasts, this is an excellent repository of more than 600 datasets for all the newbies trying to get familiar with ML. It will help you go from ML-clueless to ML connoisseur in no time!
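
To make the grocery showdown from the Data.gov.in entry a little more concrete, here is a minimal Python sketch of how one might compare prices across years once a CPI file has been downloaded. Everything specific in it is an assumption for illustration: the column names `year` and `cpi`, the helper names, and above all the numbers, which are made-up placeholders and not real CPI values. A real download will likely have a different layout and need some tidying first.

```python
import csv
import io

# Made-up placeholder numbers, NOT real CPI values.
# Replace this with the contents of a file you actually downloaded.
SAMPLE_CSV = """year,cpi
2009,100.0
2014,150.0
2019,180.0
2024,240.0
"""


def load_cpi(csv_text):
    """Parse CSV text with 'year' and 'cpi' columns into a {year: index} dict."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {int(row["year"]): float(row["cpi"]) for row in reader}


def equivalent_amount(cpi_by_year, amount, from_year, to_year):
    """Scale an amount by the ratio of the price index in the two years."""
    return amount * cpi_by_year[to_year] / cpi_by_year[from_year]


def percent_change(cpi_by_year, from_year, to_year):
    """Percentage change in the price index between two years."""
    return (cpi_by_year[to_year] / cpi_by_year[from_year] - 1) * 100


if __name__ == "__main__":
    cpi = load_cpi(SAMPLE_CSV)
    rupees_now = equivalent_amount(cpi, 15, 2009, 2024)
    rise = percent_change(cpi, 2009, 2024)
    print(f"15 rupees in 2009 is roughly {rupees_now:.2f} rupees in 2024")
    print(f"Prices rose by about {rise:.1f}% over that period")
```

The sketch sticks to the standard library so it runs anywhere; with a real, larger file you would probably reach for pandas instead. The same index-ratio idea carries over to any of the price or GDP series in the sources above.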

So, I hope that equipped with these sources, you will make your portfolio stand out like a kangaroo in a penguin enclosure. Do always remember the wise words of Franklin D. Roosevelt, “The only thing we have to fear is fear itself, and maybe not backing up important data” (don’t quote me on that though).

Till we meet again, Data comrades!
 
About Author
Vasudev Pandey
I am a budding data scientist and mechatronics engineer with a passion for history and finance. I write about anything and everything I find interesting.

Comments

  1. Thank you for sharing these public datasets with the TakeOff Talent community, Vasudev. This will surely help many people who are confused around how to build a portfolio in analytics as well as in data science.

  2. great article, thank you so much for sharing vasudev

  3. Thanks Vasu...very helpful

  4. This will surely help the DS community. - Dan


