What We Do
Big data science is used everywhere: in athletics, to predict an opposing team's plays during games; at Pandora Radio, to categorize songs and play new ones the listener is likely to enjoy; and in sequencing the human genome. The volume of data has grown rapidly over the last few years, accelerated by sharp increases in social media and smartphone use. As a consequence, there is now a major shortage of trained analysts to fill the rapidly growing number of data analytics positions. According to the U.S. Bureau of Labor Statistics, the average projected growth rate for all occupations is 5-8% over a ten-year projection period. However, a variety of occupations that require big data science are growing much faster than average, such as statisticians, with projected growth of 34% over a ten-year period. With the immense growth and use of big data in virtually all fields, it is becoming imperative that a wide variety of students learn how to visualize, analyze, and draw conclusions from enormous sets of data. The DataJam was formed to develop new strategies for engaging high school students in the field of big data and encouraging them to pursue further education so they can participate fully in a data-driven world! The DataJam now offers programming for high school students (“The Annual High School DataJam”), community college students (“The Biannual Community College DataJam”), and middle school students (“Middle School DataJam Days”).
The DataJam is a nonprofit organization formed by software scientists from IBM, Oracle, Teradata, and Iqvia together with faculty from the University of Pittsburgh, Carnegie Mellon University, and the joint Pitt-CMU Supercomputer Center. They came together to develop and run the DataJam to inspire high school youth to enter the field of big data science. The central goal was to familiarize youth with the field and make them more aware of the role of big data in their everyday lives. Because many companies and businesses rely heavily on big data and are using it to help shape the future, the underlying concept was to build a pipeline of young people who are interested in, and trained to participate in, the big data science of the future. The DataJam began running a competition for teams from local high schools in the 2013-2014 school year. It has expanded every year, and teams from across the country now participate.
Ten Years of Introducing Youth to the Power of Data Science (2014-2024)
August 14, 2024
Read about the impact The DataJam has had on individuals who have participated (students, teachers, mentors), communities (schools, school districts, colleges & universities, business partners), and future data scientists! Track the national and international expansion of The DataJam, and learn about the ongoing activities that are strengthening The DataJam Community. The DataJam is impacting Access to Digital Skills for a wide diversity of youth. Learn how we hope to extend our mission “To grow communities of learners who use data, analysis, and critical thinking skills to better understand and impact the world around us” in the next ten years!
The High School DataJam
The High School DataJam is an academic competition for teams from high schools and afterschool programs, such as the Boys & Girls Clubs, that focuses on teaching students to use big data to answer a research question. Students usually work in teams of 3-8 to formulate a research question, find publicly available data sets, analyze their data, make data visualizations, and present their findings to a panel of judges. Students learn skills pertaining to the scientific method, data analysis, and giving scientific presentations. Schools can have multiple DataJam teams if they choose to.
The fall of each academic year is a great time for teams to start forming and thinking about participating in the High School DataJam. A good way to get started is to view one of the two introductory videos, “Introduction to the DataJam” or “DataJam Mentor Overview to the DataJam”, which are available at the top of the DataJam page at pghdataworks.org. It is also very helpful for teams to view the video “A Walk Through The DataJam Website”, which can be accessed at the top of either the DataJam page or the Home Page. If teams or teachers have questions about the logistics of participating in the High School DataJam, just email datajam@thedatajam.org and ask for a Zoom conference. We will meet with you and answer any questions you have.
New teams may want to join the online Slack workspace for the High School DataJam. This is a workspace on the web where teams have their own channel and can work on their project collaboratively when they are not all together. They can also message their High School DataJam mentor on the Slack workspace and get assistance when they need it. If your team wants to join the Slack workspace, each team member needs to have their parent fill out and sign a permission slip (found in the DataJam Guide Book that can be downloaded from the DataJam page). Email the permission slips to datajam@thedatajam.org.
During December and January teams need to submit a High School DataJam proposal that includes their research question, their hypothesis, and the datasets they plan to use to address their question. A template for the High School DataJam proposal is found in the High School DataJam Guide Book.
High School DataJam teams then work on analyzing their data until late March, at which point they turn in a poster describing the findings of their analysis and give an oral presentation to a panel of judges. All teams have the opportunity to display posters of their project at a High School DataJam finale in late April. The week before the finale, each team presents their project in an oral presentation on Zoom to a panel of High School DataJam judges. Awards are given for 1st, 2nd, and 3rd place, as well as for Best New Team and Best Presentation. All students receive a certificate of participation and a participation prize.
Community College DataJam
The Community College DataJam is a semester-long data science activity and competition, offered in both the Fall (Aug-Dec) and Spring (Jan-May) semesters each year. It introduces college students to data science and encourages them to engage with it through a project in any subject area that interests them. DataJam’s goal is to help students engage successfully with the ways data is used everywhere to solve problems, and to better understand social, economic, and environmental aspects of our world. As with the High School DataJam, teams choose their own topic to study, and every team is paired with a DataJam mentor, usually a university student, who provides individualized assistance. Importantly, college students can include their DataJam poster and presentation in their college portfolio as a compelling example of a hands-on learning project that will be of interest to potential employers.
Middle School DataJam Days
Middle School DataJam Days are designed to introduce middle school students to the power of analyzing data to find answers to questions. They are half-day workshops, run in person, in areas of the country where DataJam mentors are trained. Students work in groups, with a mentor leading a session on a topic chosen to engage middle school students, such as “UFO Sightings” or “Shark Attack Locations”. The mentors come to the workshop prepared with data sets appropriate for addressing questions related to the topic. Students bring laptop computers to the workshop, and each group works together using Google Sheets and Docs. Mentors guide students through developing specific research questions, forming hypotheses, choosing which analyses to do, carrying out those analyses, and making data visualizations. At the end of the workshop, each group gives a short presentation about their research findings.
Participating Middle Schools, High Schools, After School Programs & Community Colleges
Teams from thirty-five schools and afterschool programs have participated in the High School DataJam over the past eight years. Click a logo to learn more about a particular school or afterschool program.
The Board of Directors
The Board of Directors of The DataJam plays a crucial role in overseeing the organization and management of the annual DataJam competition. This team of eight members, including Beth Bauer, Cheryl Begandy, Judy Cameron, Catherine Cramer, Brian Macdonald, Devashish Saxena, Beth Schwanke, and Raja Sooriamurthi, is responsible for designing and continually updating the DataJam and its accompanying resources to ensure successful project implementation by participating teams. They are also involved in expanding the reach of the DataJam and the training of DataJam Mentors on a national scale, aiming to engage diverse communities across the country.
How Data Professionals Can Get Involved
At the DataJam, industry professionals play a crucial role in inspiring and guiding the next generation of data scientists. Your support directly impacts the educational experiences of high school and community college students, fostering their interest in data science and preparing them for future careers. By supporting DataJam, you engage with a diverse community of educators, students, and industry professionals, fostering collaboration and knowledge sharing. Your contribution fuels innovation in data science education, empowering students to tackle complex challenges and develop creative solutions using big data analytics. There are several ways you can support DataJam and make a meaningful impact!
Judge the Competition
Become a judge for DataJam competitions and help evaluate projects developed by high school and community college students. Your expertise and insights will contribute to recognizing and rewarding innovative solutions and exceptional analytical skills among young participants.
Host a Field Trip
Open your doors and host a field trip at your place of business. Show students firsthand how big data is applied in real-world settings. This immersive experience can spark curiosity, inspire learning, and provide valuable exposure to industry practices. We like to offer field trips for all teams winning DataJam awards after the DataJam Finale each year. For these special field trips we like to arrange for the teams to present their 10-minute final presentation to the industry professionals.
2023 Google Visit North Allegheny Team 1
Donate and Pledge your Financial Support
Consider making a monetary donation to DataJam or pledging a yearly contribution. Your financial support enables us to continue organizing competitions, providing resources to participants, and expanding educational initiatives in big data science.
If you're interested in supporting DataJam through donations of money or time, please reach out to us at datajam@thedatajam.org. We welcome contributions at all levels and appreciate your commitment to empowering the next generation of data-driven innovators.
Join us in shaping a brighter future through data science education. Together, we can inspire, educate, and empower young minds to excel in the dynamic world of big data.
Partners and Sponsors
PPG and the PPG Foundation
A Pittsburgh company since 1883, PPG is a global supplier of paints, coatings, optical products, and specialty materials. Through leadership in innovation, sustainability and color, PPG helps customers in industrial, transportation, consumer products, and construction markets and aftermarkets to enhance more surfaces in more ways than does any other company. Like many companies PPG is using data science to fuel a digital transformation, providing inspiration and a realistic view of careers for the DataJam students. The PPG Foundation provided funding in 2019-20, helping the DataJam to shift to a virtual format in response to COVID-19.
Northeast Big Data Innovation Hub
The DataJam is proud to be a collaborator of the Northeast Big Data Innovation Hub (Northeast Hub). The mission of the Northeast Hub is to build and strengthen partnerships across industry, academia, nonprofits, and government to address societal and scientific challenges, spur economic development, and accelerate innovation in the national big data ecosystem. The Northeast Hub is a community convener, collaboration hub, and catalyst for data science innovation in the Northeast Region. The Hub amplifies successes of the community, and shares credit across the community to encourage collaboration and mutual success in data science endeavors. The Northeast Hub region includes the states of Connecticut, Maine, Massachusetts, New Hampshire, New Jersey, New York, Pennsylvania, Rhode Island and Vermont.
The Math & Science Collaborative
The MSC is a program of the Allegheny Intermediate Unit that focuses on STEM education. The MSC brings innovative and effective approaches in curriculum and instruction to the region, preparing educators to support all students for work and career in the 21st century. It was formed in 1994 by a regional “congress” of stakeholders and reaches out to more than 135 public and non-public schools/districts in Allegheny, Armstrong, Beaver, Butler, Fayette, Greene, Indiana, Lawrence, Mercer, Washington, and Westmoreland counties. The MSC helped promote the DataJam to its member school districts this past year and identified joint opportunities to work together in the future.
Pitt Cyber Institute
The University of Pittsburgh Institute for Cyber Law, Policy, and Security provides a unique interdisciplinary environment for tackling cyber challenges. They bring the breadth of one of the world’s leading public research universities to bear on the critical questions of networks, data, and algorithms, with a focus on the ever-changing gaps among law, policy, and technology. Their collective of legal, policy, and technical researchers engages with policymakers and industry to create both actionable proposals to address current demands and fundamental insights to understand the future as it arrives. DataJam is teaming with Pitt Cyber to bring issues of data ethics and security into the program for the students to consider as part of their research.
The DataJam Board of Directors is supported by their home institutions, all of which have major education, research and development programs in Big Data, Data Analytics, and Data Science:
University of Pittsburgh
The University of Pittsburgh is a state-related research university. Founded in 1787, Pitt is one of the oldest institutions of higher education in the United States. Pitt people have defeated polio, unlocked the secrets of DNA, led the world in organ transplantation, and pioneered TV and heavier-than-air flight, among numerous other accomplishments.
Carnegie Mellon University
Carnegie Mellon University is a private, global research university. Carnegie Mellon stands among the world's most renowned educational institutions and sets its own course with cutting-edge brain science, path-breaking performances, innovative start-ups, driverless cars, big data, big ambitions, Nobel and Turing prizes, hands-on learning, and a whole lot of robots.
Pittsburgh Supercomputing Center
Pittsburgh Supercomputing Center is a joint effort of Carnegie Mellon University and the University of Pittsburgh. Established in 1986, PSC advances the state of the art in high-performance computing, communications and data analytics and offers a flexible environment for solving the largest and most challenging problems in data and computational science to scientists and engineers nationwide for unclassified research.
San Diego Supercomputer Center (SDSC)
The San Diego Supercomputer Center (SDSC) at UC San Diego is a leader in high-performance and data-intensive computing and cyberinfrastructure. Cyberinfrastructure refers to an accessible, integrated network of computer-based resources and expertise, focused on accelerating scientific inquiry and discovery.
SDSC provides resources, services and expertise to the local, regional, and national research community, including industry and academia. It supports hundreds of multidisciplinary programs spanning a wide variety of domains. SDSC was founded in 1985 with a $170 million grant from the NSF Supercomputer Centers program.
SDSC's history includes pioneering advances in data storage and cloud computing, from which have emerged several Centers of Excellence in the areas of large-scale data management, predictive analytics, health IT services, workflow automation and internet analysis.
PosiROI
With 30+ years working to identify, qualify and mine data, analytics and insights across the pharmaceutical and healthcare landscapes, we've been creating both small and scaled value solutions for our customers before most people knew the term Big Data. We work with our customers to understand the complexities and challenges they are facing. Then we work together to develop the business outcome-aligned data, analytic and technology strategies needed to accelerate and enable value today, while building the foundations and modern curation needed for sustainability.
We work with you to reveal win-win 360° customer value stories using your fit-for-purpose data, at scale. If you are not able to execute and quantify business impact, then all this data, research and insights, and all those data science dollars, appear as cost instead of value.
We are collaborating with the DataJam to help with expansion in data education for all, and share the variety of exciting consultative data careers that make an impact on our world and our future.
Woods Hole Institute
The Woods Hole Institute (WHI) is a 501(c)(3) non-profit that helps connect people and ideas among disciplines through a wide range of experiences such as colloquia, seminars, retreats, workshops, performances, and installations. WHI focuses on a range of topics including:
Complexity: Addressing the daunting problems we now face as a species requires valuing complexity as a framework for addressing humanity’s wicked problems.
Convergence: Addressing 21st century problems will take the collective minds of all of us. Converging the disciplines of science and valuing and working together across the social and physical sciences as well as the arts and humanities is going to be how we create a future for all of us.
Sustainability and Resilience: In order to create communities that are adaptive to extreme environmental events and serve all their members in ways that can function in perpetuity, we must rethink our relationship to the places we live and work. This will only happen with deep engagement, understanding what matters, knowing what it means to be part of a dynamic system, and working together to create healthy and equitable ways of living.
Emergence: With even the best minds and tools of science and engineering, we can’t always predict what happens next. But we can be prepared to expect the unexpected. With the rapidly changing climate, stresses on food systems, and increasing destruction of wildlands, we have to anticipate the next superstorm, the next pandemic, or the next dramatic change in climate and be willing to have plans to respond to something big.
West Big Data Innovation Hub (WBDIH)
The West Big Data Innovation Hub is an inclusive community for catalyzing and scaling data science for societal needs. Our mission is to build and strengthen partnerships across academia, industry, nonprofits, and government—connecting research, education, and practice to harness the data revolution.
With a focus on thematic ‘verticals’ such as metro/urban data science, and natural resource management, especially water, as well as cross-cutting ‘horizontals’ such as open science, workforce development, and data ethics, the West Hub enables creative cross-pollination and resource-sharing.
Fueled by outcomes-focused partnerships, the West Hub facilitates the development of collaborative pilot projects addressing regional needs, while connecting and scaling efforts as part of a larger global network. The WBDIH connects, convenes, curates, and communicates across our network with an emphasis on enabling interoperable, scalable, and sustainable solutions.