by AK and the research community

Mar 12

Harnessing the Hubble Space Telescope Archives: A Catalogue of 21,926 Interacting Galaxies

Mergers play a complex role in galaxy formation and evolution. Continuing to improve our understanding of these systems requires ever larger samples, which can be difficult (even impossible) to select from individual surveys. We use the new platform ESA Datalabs to assemble a catalogue of interacting galaxies from the Hubble Space Telescope science archives; this catalogue is larger than previously published catalogues by nearly an order of magnitude. In particular, we apply the Zoobot convolutional neural network directly to the entire public archive of HST F814W images and make probabilistic interaction predictions for 126 million sources from the Hubble Source Catalogue. We employ a combination of automated visual representation and visual analysis to identify a clean sample of 21,926 interacting galaxy systems, mostly with z < 1. Sixty-five percent of these systems have no previous references in either the NASA Extragalactic Database or Simbad. In the process of removing contamination, we also discover many other objects of interest, such as gravitational lenses, edge-on protoplanetary disks, and 'backlit' overlapping galaxies. We briefly investigate the basic properties of this sample, and we make our catalogue publicly available for use by the community. In addition to providing a new catalogue of scientifically interesting objects imaged by HST, this work also demonstrates the power of the ESA Datalabs tool to facilitate substantial archival analysis without placing a high computational or storage burden on the end user.

The Chandra Source Catalog

The Chandra Source Catalog (CSC) is a general purpose virtual X-ray astrophysics facility that provides access to a carefully selected set of generally useful quantities for individual X-ray sources, and is designed to satisfy the needs of a broad-based group of scientists, including those who may be less familiar with astronomical data analysis in the X-ray regime. The first release of the CSC includes information about 94,676 distinct X-ray sources detected in a subset of public ACIS imaging observations from roughly the first eight years of the Chandra mission. This release of the catalog includes point and compact sources with observed spatial extents ≲ 30''. The catalog (1) provides access to the best estimates of the X-ray source properties for detected sources, with good scientific fidelity, and directly supports scientific analysis using the individual source data; (2) facilitates analysis of a wide range of statistical properties for classes of X-ray sources; and (3) provides efficient access to calibrated observational data and ancillary data products for individual X-ray sources, so that users can perform detailed further analysis using existing tools. The catalog includes real X-ray sources detected with flux estimates that are at least 3 times their estimated 1σ uncertainties in at least one energy band, while maintaining the number of spurious sources at a level of ≲ 1 false source per field for a 100 ks observation. For each detected source, the CSC provides commonly tabulated quantities, including source position, extent, multi-band fluxes, hardness ratios, and variability statistics, derived from the observations in which the source is detected. In addition to these traditional catalog elements, for each X-ray source the CSC includes an extensive set of file-based data products that can be manipulated interactively.
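
The significance criterion quoted in this abstract (a flux estimate at least 3 times its 1σ uncertainty in at least one energy band) can be sketched as a simple filter. This is only an illustration: the function name, the band labels, and the flux values are invented here and are not the CSC's column names, pipeline, or data.

```python
def passes_flux_significance(band_fluxes, min_snr=3.0):
    """Return True if any band has flux >= min_snr times its 1-sigma uncertainty.

    band_fluxes maps a band label to a (flux, sigma) pair. The labels are
    hypothetical placeholders, not CSC column names.
    """
    return any(
        sigma > 0 and flux >= min_snr * sigma
        for flux, sigma in band_fluxes.values()
    )


# Illustrative values only (arbitrary flux units).
source_a = {"soft": (2.0, 1.0), "hard": (6.5, 2.0)}  # hard band: 6.5 >= 3 * 2.0
source_b = {"soft": (2.0, 1.0), "hard": (5.0, 2.0)}  # no band reaches 3 sigma

print(passes_flux_significance(source_a))
print(passes_flux_significance(source_b))
```

Requiring the threshold in only one band, rather than in all of them, matches the "at least one energy band" wording of the abstract.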

pathfinder: A Semantic Framework for Literature Review and Knowledge Discovery in Astronomy

The exponential growth of astronomical literature poses significant challenges for researchers navigating and synthesizing general insights or even domain-specific knowledge. We present Pathfinder, a machine learning framework designed to enable literature review and knowledge discovery in astronomy, focusing on semantic searching with natural language instead of syntactic searches with keywords. Utilizing state-of-the-art large language models (LLMs) and a corpus of 350,000 peer-reviewed papers from the Astrophysics Data System (ADS), Pathfinder offers an innovative approach to scientific inquiry and literature exploration. Our framework couples advanced retrieval techniques with LLM-based synthesis to search astronomical literature by semantic context as a complement to currently existing methods that use keywords or citation graphs. It addresses complexities of jargon, named entities, and temporal aspects through time-based and citation-based weighting schemes. We demonstrate the tool's versatility through case studies, showcasing its application in various research scenarios. The system's performance is evaluated using custom benchmarks, including single-paper and multi-paper tasks. Beyond literature review, Pathfinder offers unique capabilities for reformatting answers in ways that are accessible to various audiences (e.g. in a different language or as simplified text), visualizing research landscapes, and tracking the impact of observatories and methodologies. This tool represents a significant advancement in applying AI to astronomical research, aiding researchers at all career stages in navigating modern astronomy literature.

Gaia Data Release 3: Summary of the content and survey properties

We present the third data release of the European Space Agency's Gaia mission, GDR3. The GDR3 catalogue is the outcome of the processing of raw data collected with the Gaia instruments during the first 34 months of the mission by the Gaia Data Processing and Analysis Consortium. The GDR3 catalogue contains the same source list, celestial positions, proper motions, parallaxes, and broad band photometry in the G, G_{BP}, and G_{RP} pass-bands already present in the Early Third Data Release. GDR3 introduces an impressive wealth of new data products. More than 33 million objects in the ranges G_{rvs} < 14 and 3100 < T_{eff} < 14500 have new determinations of their mean radial velocities based on data collected by Gaia. We provide G_{rvs} magnitudes for most sources with radial velocities, and a line broadening parameter is listed for a subset of these. Mean Gaia spectra are made available to the community. The GDR3 catalogue includes about 1 million mean spectra from the radial velocity spectrometer, and about 220 million low-resolution blue and red prism photometer BPRP mean spectra. The results of the analysis of epoch photometry are provided for some 10 million sources across 24 variability types. GDR3 includes astrophysical parameters and source class probabilities for about 470 million and 1500 million sources, respectively, including stars, galaxies, and quasars. Orbital elements and trend parameters are provided for some 800,000 astrometric, spectroscopic and eclipsing binaries. More than 150,000 Solar System objects, including new discoveries, with preliminary orbital solutions and individual epoch observations are part of this release. Reflectance spectra derived from the epoch BPRP spectral data are published for about 60,000 asteroids. Finally, an additional data set is provided, namely the Gaia Andromeda Photometric Survey (abridged)

AstroMLab 1: Who Wins Astronomy Jeopardy!?

We present a comprehensive evaluation of proprietary and open-weights large language models using the first astronomy-specific benchmarking dataset. This dataset comprises 4,425 multiple-choice questions curated from the Annual Review of Astronomy and Astrophysics, covering a broad range of astrophysical topics. Our analysis examines model performance across various astronomical subfields and assesses response calibration, crucial for potential deployment in research environments. Claude-3.5-Sonnet outperforms competitors by up to 4.6 percentage points, achieving 85.0% accuracy. For proprietary models, we observed a universal reduction in cost every 3-to-12 months to achieve a similar score in this particular astronomy benchmark. Open-source models have rapidly improved, with LLaMA-3-70b (80.6%) and Qwen-2-72b (77.7%) now competing with some of the best proprietary models. We identify performance variations across topics, with non-English-focused models generally struggling more in exoplanet-related fields, stellar astrophysics, and instrumentation-related questions. These challenges likely stem from less abundant training data, limited historical context, and rapid recent developments in these areas. This pattern is observed across both open-weights and proprietary models, with regional dependencies evident, highlighting the impact of training data diversity on model performance in specialized scientific domains. Top-performing models demonstrate well-calibrated confidence, with correlations above 0.9 between confidence and correctness, though they tend to be slightly underconfident. The development of fast, low-cost inference for open-weights models presents new opportunities for affordable deployment in astronomy. The rapid progress observed suggests that LLM-driven research in astronomy may become feasible in the near future.

A catalog of ringed galaxies in the TNG50 simulation: Analysis of their properties and structure

The catalog of ringed galaxies was compiled through visual classification of synthetic images from the TNG50 simulation. Galaxies were selected based on specific criteria: a redshift range of 0.01 < z < 0.1, stellar mass M_⋆ > 10^9 M_⊙, stellar half-mass radius r_{50} > 1 kpc, and specific star formation rate (sSFR), log(sSFR/yr^{-1}) > -13. Our classification allowed for differentiation between inner rings, outer rings, combinations of rings, and partial rings (pseudo-rings), including barred and non-barred ringed galaxies. We constructed a control sample of non-ringed galaxies with similar redshift, stellar mass, and environmental density distributions. We identified 807 ringed galaxies. Approximately 59% possess an inner ring, 22% a partial ring, 12% an outer ring, and 7% have i+o rings. Our statistical analysis reveals that 64% (507 galaxies) exhibit bars. Ringed galaxies exhibit lower efficiency for star formation, reduced gas fractions, redder colors, and higher metallicities compared to non-ringed disk objects. They also show greater variability in metallicity for a given stellar mass. From the analysis of radial profiles, galaxies with outer rings exhibit an r_{50} similar to or slightly larger than their control group, while those with inner or partial rings tend to have smaller sizes. A deeper exploration of radial density profiles revealed a pronounced central mass deficit preceding the ring structures, with inner and outer rings located at r_{50} and 1.5 r_{50}, respectively. Galaxies with both i+o rings have inner rings that are more compact and massive. Additionally, galaxies with partial rings exhibit deeper mass profiles than their controls, particularly in central areas. These findings improve our understanding of galactic evolution and the complex interplay between mass distribution and morphology.
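
The four selection cuts listed in this abstract can be written as a single predicate. The sketch below is illustrative only: the `Galaxy` record, its field names, and the example values are assumptions made here, not the TNG50 data model or the authors' code.

```python
from dataclasses import dataclass


@dataclass
class Galaxy:
    # Hypothetical record; field names are illustrative, not TNG50 fields.
    redshift: float
    stellar_mass_msun: float   # stellar mass in solar masses
    r50_kpc: float             # stellar half-mass radius in kpc
    log_ssfr_per_yr: float     # log10 of the specific star formation rate per year


def meets_selection(g: Galaxy) -> bool:
    """Apply the cuts quoted in the abstract (all must hold)."""
    return (
        0.01 < g.redshift < 0.1
        and g.stellar_mass_msun > 1e9
        and g.r50_kpc > 1.0
        and g.log_ssfr_per_yr > -13.0
    )


# Made-up examples: the second fails only the half-mass radius cut.
candidates = [
    Galaxy(0.05, 3e10, 4.2, -10.5),
    Galaxy(0.05, 3e10, 0.8, -10.5),
]
selected = [g for g in candidates if meets_selection(g)]
print(len(selected))
```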

The FAST HI 21-cm absorption blind survey. II. -- Statistic Exploration for Associated and Intervening systems

We present an extragalactic HI 21-cm absorption lines catalog from a blind search at z ≤ 0.35, using drift-scan data collected in 1325.6 hours by the ongoing Commensal Radio Astronomy FasT Survey (CRAFTS) and FAST All Sky HI Survey (FASHI), which spans a sky area of 6072.0 deg^{2} and covers 84533 radio sources with a flux density greater than 12 mJy. 14 previously identified HI absorbers and 20 newly discovered HI absorbers were detected, comprising 15 associated systems, 10 intervening systems, and 9 systems with undetermined classifications. Through spectral stacking, the mean peak optical path, mean velocity-integrated optical path, mean FWHM and mean HI column density are measured to be 0.47 and 0.30; 27.19 and 4.36 km s^{-1}; 42.61 and 9.33 km s^{-1}; 0.49 and 0.08 T_{s} × 10^{20} cm^{-2} K^{-1}, for the associated and intervening samples, respectively. Statistical analysis also reveals that associated systems tend to be hosted by red (g-r > 0.7) galaxies at lower redshifts, whereas galaxies hosting intervening HI absorption are typically found at higher redshifts and are of a bluer (g-r ≤ 0.7) type. A noticeable difference is observed in the positions of foregrounds, backgrounds of intervening systems, and high-redshift and low-redshift associated systems on the WISE color-color diagram. All identified foreground sources in our sample have W1-W2 magnitudes below 0.8, suggesting no Active Galactic Nuclei (AGN). In contrast, backgrounds of intervening systems tend to have W1-W2 magnitudes above 0.8, indicating AGN presence. For associated absorption, most low-redshift (z ≤ 0.5) systems show W1-W2 values below 0.8, while higher-redshift associated absorption (z > 0.5) displays a broader range of W1-W2 values.
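
The two colour cuts mentioned in this abstract (g-r at 0.7 and W1-W2 at 0.8) can be expressed as small helper functions. This is a rough sketch with invented function names and labels. The abstract does not say how a W1-W2 value of exactly 0.8 is treated, so assigning it to the non-AGN side is an assumption made here.

```python
def optical_colour_class(g_minus_r, cut=0.7):
    """Label a host as 'red' (g-r > cut) or 'blue' (g-r <= cut)."""
    return "red" if g_minus_r > cut else "blue"


def wise_agn_flag(w1_minus_w2, cut=0.8):
    """Return True when W1-W2 lies above the cut, read here as AGN-like.

    Assumption: a value exactly equal to the cut is treated as non-AGN.
    """
    return w1_minus_w2 > cut


# Made-up colours for illustration only.
print(optical_colour_class(0.85), wise_agn_flag(0.35))
print(optical_colour_class(0.55), wise_agn_flag(1.10))
```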