AI-powered satellite data reveals clouds in 3D
-
Similar Topics
-
By NASA
6 min read
Smarter Searching: NASA AI Makes Science Data Easier to Find
Image snapshot taken from NASA Worldview of NASA’s Global Precipitation Measurement (GPM) mission on March 15, 2025, showing heavy rain across the southeastern U.S. with an overlay of the GCMD Keyword Recommender for Earth Science, Atmosphere, Precipitation, Droplet Size. Credit: NASA Worldview
Imagine shopping for a new pair of running shoes online. If each seller described them differently (one calling them “sneakers,” another “trainers,” and someone else “footwear for exercise”), you’d quickly feel lost in a sea of mismatched terminology. Fortunately, most online stores use standardized categories and filters, so you can click through a simple path (Women’s > Shoes > Running Shoes) and quickly find what you need.
Now, scale that problem to scientific research. Instead of sneakers, think “aerosol optical depth” or “sea surface temperature.” Instead of a handful of retailers, it is thousands of researchers, instruments, and data providers. Without a common language for describing data, finding relevant Earth science datasets would be like trying to locate a needle in a haystack, blindfolded.
That’s why NASA created the Global Change Master Directory (GCMD), a standardized vocabulary that helps scientists tag their datasets in a consistent and searchable way. But as science evolves, so does the challenge of keeping metadata organized and discoverable.
To meet that challenge, NASA’s Office of Data Science and Informatics (ODSI) at the agency’s Marshall Space Flight Center (MSFC) in Huntsville, Alabama, developed the GCMD Keyword Recommender (GKR): a smart tool designed to help data providers and curators assign the right keywords, automatically.
Smarter Tagging, Accelerated Discovery
The upgraded GKR model isn’t just a technical improvement; it’s a leap forward in how we organize and access scientific knowledge. By automatically recommending precise, standardized keywords, the model reduces the burden on human curators while ensuring metadata quality remains high. This makes it easier for researchers, students, and the public to find exactly the datasets they need.
It also sets the stage for broader applications. The techniques used in GKR, like applying focal loss to rare-label classification problems and adapting pre-trained transformers to specialized domains, can benefit fields well beyond Earth science.
Metadata Matchmaker
The newly upgraded GKR model tackles a massive challenge in information science known as extreme multi-label classification. That’s a mouthful, but the concept is straightforward: Instead of predicting just one label, the model must choose many, sometimes dozens, from a set of thousands. Each dataset may need to be tagged with multiple, nuanced descriptors pulled from a controlled vocabulary.
Think of it like trying to identify all the animals in a photograph. If there’s just a dog, it’s easy. But if there’s a dog, a bird, a raccoon hiding behind a bush, and a unicorn that only shows up in 0.1% of your training photos, the task becomes far more difficult. That’s what GKR is up against: tagging complex datasets with precision, even when examples of some keywords are scarce.
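A minimal sketch of the multi-label idea, not the GKR implementation: each keyword gets its own independent score, and every keyword whose probability clears a threshold is kept. The keyword names, the logit values, the 0.5 threshold, and the `predict_keywords` helper are all illustrative assumptions.

```python
import math


def sigmoid(x: float) -> float:
    """Map a raw score (logit) to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_keywords(logits: dict[str, float], threshold: float = 0.5) -> list[str]:
    """Return every keyword whose independent probability clears the threshold.

    Unlike single-label classification, the probabilities are not forced to
    sum to 1, so zero, one, or many keywords can be selected at once.
    """
    return sorted(k for k, z in logits.items() if sigmoid(z) >= threshold)


# Hypothetical scores for one dataset description (made-up values).
example_logits = {
    "PRECIPITATION": 2.3,
    "DROPLET SIZE": 0.4,
    "SEA SURFACE TEMPERATURE": -3.1,
    "AEROSOL OPTICAL DEPTH": -1.2,
}

print(predict_keywords(example_logits))
```

In a real system the vocabulary would hold thousands of keywords rather than four, which is what makes the problem "extreme."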
And the problem is only growing. The new version of GKR now considers more than 3,200 keywords, up from about 430 in its earlier iteration. That’s a sevenfold increase in vocabulary complexity, and a major leap in what the model needs to learn and predict.
To handle this scale, the GKR team didn’t just add more data; they built a more capable model from the ground up. At the heart of the upgrade is INDUS, an advanced language model trained on a staggering 66 billion words drawn from scientific literature across disciplines—Earth science, biological sciences, astronomy, and more.
NASA ODSI’s GCMD Keyword Recommender AI model automatically tags scientific datasets with the help of INDUS, a large language model trained on NASA scientific publications across the disciplines of astrophysics, biological and physical sciences, Earth science, heliophysics, and planetary science. Credit: NASA
“We’re at the frontier of cutting-edge artificial intelligence and machine learning for science,” said Sajil Awale, a member of the NASA ODSI AI team at MSFC. “This problem domain is interesting, and challenging, because it’s an extreme classification problem where the model needs to differentiate even very similar keywords/tags based on small variations of context. It’s exciting to see how we have leveraged INDUS to build this GKR model because it is designed and trained for scientific domains. There are opportunities to improve INDUS for future uses.”
This means that the new GKR isn’t just guessing based on word similarities; it understands the context in which keywords appear. It’s the difference between a model knowing that “precipitation” might relate to weather versus recognizing when it means a climate variable in satellite data.
And while the older model was trained on only 2,000 metadata records, the new version had access to a much richer dataset of more than 43,000 records from NASA’s Common Metadata Repository. That increased exposure helps the model make more accurate predictions.
The Common Metadata Repository is the backend behind the following data search and discovery services:
Earthdata Search
International Data Network
Learning to Love Rare Words
One of the biggest hurdles in a task like this is class imbalance. Some keywords appear frequently; others might show up just a handful of times. Models trained with a standard objective such as cross-entropy loss, which was used initially to train GKR, tend to favor the easy, common labels and neglect the rare ones.
To solve this, NASA’s team turned to focal loss, a strategy that reduces the model’s attention to obvious examples and shifts focus toward the harder, underrepresented cases.
The result? A model that performs better across the board, especially on the keywords that matter most to specialists searching for niche datasets.
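The difference between the two losses can be sketched for a single keyword. This is a generic illustration of binary focal loss, not the GKR training code: the focusing parameter `gamma=2.0` is a commonly used default and an assumption here, since the article does not state the settings NASA's team used, and the example probabilities are made up.

```python
import math

EPS = 1e-12  # guards against log(0)


def cross_entropy(p: float, y: int) -> float:
    """Binary cross-entropy for one keyword.

    p is the predicted probability that the keyword applies;
    y is the true label (1 = applies, 0 = does not).
    """
    p_t = p if y == 1 else 1.0 - p  # probability assigned to the true label
    return -math.log(max(p_t, EPS))


def focal_loss(p: float, y: int, gamma: float = 2.0) -> float:
    """Cross-entropy scaled by (1 - p_t) ** gamma.

    Confident, correct predictions (p_t near 1) are down-weighted almost
    to zero, so hard examples dominate the total loss.
    """
    p_t = p if y == 1 else 1.0 - p
    return (1.0 - p_t) ** gamma * -math.log(max(p_t, EPS))


# An "easy" example (model is confident and right) and a "hard" one
# (model gives the true keyword only a 10% chance).
for name, p in [("easy", 0.95), ("hard", 0.10)]:
    ce = cross_entropy(p, 1)
    fl = focal_loss(p, 1)
    print(f"{name}: cross-entropy={ce:.4f} focal={fl:.4f} kept={fl / ce:.2%}")
```

With `gamma=0` the scaling factor is 1 and focal loss reduces to plain cross-entropy; larger values of `gamma` suppress easy examples more aggressively.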
From Metadata to Mission
Ultimately, science depends not only on collecting data, but on making that data usable and discoverable. The updated GKR tool is a quiet but critical part of that mission. By bringing powerful AI to the task of metadata tagging, it helps ensure that the flood of Earth observation data pouring in from satellites and instruments around the globe doesn’t get lost in translation.
In a world awash with data, tools like GKR help researchers find the signal in the noise and turn information into insight.
Beyond powering GKR, the INDUS large language model is also enabling innovation across other NASA Science Mission Directorate (SMD) projects. For example, INDUS supports the Science Discovery Engine by helping automate metadata curation and improving the relevancy ranking of search results. The diverse applications reflect INDUS’s growing role as a foundational AI capability for SMD.
The INDUS large language model is funded by the Office of the Chief Science Data Officer within NASA’s Science Mission Directorate at NASA Headquarters in Washington. The Office of the Chief Science Data Officer advances scientific discovery through innovative applications and partnerships in data science, advanced analytics, and artificial intelligence.
Last Updated: Jul 09, 2025
Related Terms: Science & Research, Artificial Intelligence (AI)
View the full article
-
By European Space Agency
The Meteosat Third Generation Sounder (MTG-S1) satellite, which is hosting the instrument for the Copernicus Sentinel-4 mission, has been placed inside the nose cone of the Falcon 9 launch rocket and is ready for the scheduled liftoff at 23:03 CEST on Tuesday, 1 July.
View the full article
-
By European Space Agency
At ESA’s Living Planet Symposium, scientists have unveiled how the combination of different long-term, high-resolution satellite datasets from ESA’s Climate Change Initiative is shedding new light on the South American Gran Chaco – one of the world’s most endangered dry forest ecosystems. These data reveal, in remarkable clarity, that fire is the primary driver of widespread, accelerating deforestation across the region.
View the full article
-
By European Space Agency
Today, at the Living Planet Symposium, ESA revealed the first stunning images from its groundbreaking Biomass satellite mission – marking a major leap forward in our ability to understand how Earth’s forests are changing and exactly how they contribute to the global carbon cycle. But these inaugural glimpses go beyond forests. Remarkably, the satellite is already showing potential to unlock new insights into some of Earth’s most extreme environments.
View the full article
-
By NASA
5 min read
NASA Launching Rockets Into Radio-Disrupting Clouds
NASA is launching rockets from a remote Pacific island to study mysterious, high-altitude cloud-like structures that can disrupt critical communication systems. The mission, called Sporadic-E ElectroDynamics, or SEED, opens its three-week launch window from Kwajalein Atoll in the Marshall Islands on Friday, June 13.
The atmospheric features SEED is studying are known as Sporadic-E layers, and they create a host of problems for radio communications. When they are present, air traffic controllers and marine radio users may pick up signals from unusually distant regions, mistaking them for nearby sources. Military operators using radar to see beyond the horizon may detect false targets — nicknamed “ghosts” — or receive garbled signals that are tricky to decipher. Sporadic-E layers are constantly forming, moving, and dissipating, so these disruptions can be difficult to anticipate.
An animated illustration depicts Sporadic-E layers forming in the lower portions of the ionosphere, causing radio signals to reflect back to Earth before reaching higher layers of the ionosphere. Credit: NASA’s Goddard Space Flight Center/Conceptual Image Lab
Sporadic-E layers form in the ionosphere, a layer of Earth’s atmosphere that stretches from about 40 to 600 miles (60 to 1,000 kilometers) above sea level. Home to the International Space Station and most Earth-orbiting satellites, the ionosphere is also where we see the greatest impacts of space weather. Primarily driven by the Sun, space weather causes myriad problems for our communications with satellites and between ground systems. A better understanding of the ionosphere is key to keeping critical infrastructure running smoothly.
The ionosphere is named for the charged particles, or ions, that reside there. Some of these ions come from meteors, which burn up in the atmosphere and leave traces of ionized iron, magnesium, calcium, sodium, and potassium suspended in the sky. These “heavy metals” are more massive than the ionosphere’s typical residents and tend to sink to lower altitudes, below 90 miles (140 kilometers). Occasionally, they clump together to create dense clusters known as Sporadic-E layers.
The Perseids meteor shower peaks in mid-August. Meteors like these can deposit metals into Earth’s ionosphere that can help create cloud-like structures called Sporadic-E layers. Credit: NASA/Preston Dyches
“These Sporadic-E layers are not visible to naked eye, and can only be seen by radars. In the radar plots, some layers appear like patchy and puffy clouds, while others spread out, similar to an overcast sky, which we call blanketing Sporadic-E layer,” said Aroh Barjatya, the SEED mission’s principal investigator and a professor of engineering physics at Embry-Riddle Aeronautical University in Daytona Beach, Florida. The SEED team includes scientists from Embry-Riddle, Boston College in Massachusetts, and Clemson University in South Carolina.
“There’s a lot of interest in predicting these layers and understanding their dynamics because of how they interfere with communications,” Barjatya said.
A Mystery at the Equator
Scientists can explain Sporadic-E layers when they form at midlatitudes but not when they appear close to Earth’s equator — such as near Kwajalein Atoll, where the SEED mission will launch.
In the Northern and Southern Hemispheres, Sporadic-E layers can be thought of as particle traffic jams.
Think of ions in the atmosphere as miniature cars traveling single file in lanes defined by Earth’s magnetic field lines. These lanes connect Earth end to end — emerging near the South Pole, bowing around the equator, and plunging back into the North Pole.
A conceptual animation shows Earth’s magnetic field. The blue lines radiating from Earth represent the magnetic field lines that charged particles travel along. Credit: NASA’s Goddard Space Flight Center/Conceptual Image Lab
At Earth’s midlatitudes, the field lines angle toward the ground, descending through atmospheric layers with varying wind speeds and directions. As the ions pass through these layers, they experience wind shear, turbulent gusts that cause their orderly line to clump together. These particle pileups form Sporadic-E layers.
But near the magnetic equator, this explanation doesn’t work. There, Earth’s magnetic field lines run parallel to the surface and do not intersect atmospheric layers with differing winds, so Sporadic-E layers shouldn’t form. Yet, they do — though less frequently.
“We’re launching from the closest place NASA can to the magnetic equator,” Barjatya said, “to study the physics that existing theory doesn’t fully explain.”
Taking to the Skies
To investigate, Barjatya developed SEED to study low-latitude Sporadic-E layers from the inside. The mission relies on sounding rockets, uncrewed suborbital spacecraft carrying scientific instruments. Their flights last only a few minutes, but the rockets can be launched precisely at fleeting targets.
Beginning the night of June 13, Barjatya and his team will monitor ALTAIR (ARPA Long-Range Tracking and Instrumentation Radar), a high-powered, ground-based radar system at the launch site, for signs of developing Sporadic-E layers. When conditions are right, Barjatya will give the launch command. A few minutes later, the rocket will be in flight.
The SEED science team and mission management team in front of the ARPA Long-Range Tracking and Instrumentation Radar (ALTAIR). The SEED team will use ALTAIR to monitor the ionosphere for signs of Sporadic-E layers and time the launch. Credit: U.S. Army Space and Missile Defense Command
On ascent, the rocket will release colorful vapor tracers. Ground-based cameras will track the tracers to measure wind patterns in three dimensions. Once inside the Sporadic-E layer, the rocket will deploy four subpayloads, miniature detectors that will measure particle density and magnetic field strength at multiple points. The data will be transmitted back to the ground as the rocket descends.
On another night during the launch window, the team will launch a second, nearly identical rocket to collect additional data under potentially different conditions.
Barjatya and his team will use the data to improve computer models of the ionosphere, aiming to explain how Sporadic-E layers form so close to the equator.
“Sporadic-E layers are part of a much larger, more complicated physical system that is home to space-based assets we rely on every day,” Barjatya said. “This launch gets us closer to understanding another key piece of Earth’s interface to space.”
By Miles Hatfield
NASA’s Goddard Space Flight Center, Greenbelt, Md.
Last Updated: Jun 12, 2025
Related Terms: Heliophysics, Goddard Space Flight Center, Heliophysics Division, Ionosphere, Missions, NASA Centers & Facilities, NASA Directorates, Science & Research, Science Mission Directorate, Sounding Rockets, Sounding Rockets Program, The Solar System, The Sun, Uncategorized, Wallops Flight Facility, Weather and Atmospheric Dynamics
View the full article