Search Fedora Package Repositories Using Sourcegraph

Roseline Bassey, Fedora community

As our collection of open source packages grows, searching our dist-git repositories needs to become easier. In 2022, Sourcegraph teamed up with the Fedora community to integrate its free Code Search engine with our distribution package repositories, which now include over 38,000 packages. With Sourcegraph’s Code Search, developers, contributors, and maintainers can search dist-git for RPM specfiles, module and container definitions, Fedora-specific patches, tests, and much more, all in one place, reducing the time spent hunting for files.

What is Sourcegraph?

Sourcegraph is a code intelligence platform. You can think of it as a search engine for code. It can search code across all code hosts, repositories, and branches. Sourcegraph has many features, such as code intelligence, code insights, batch changes, and Cody (an AI coding assistant), but at its core it is a Code Search tool. In this article, we will explore how to use Code Search for the src.fedoraproject.org repositories, also known as Fedora dist-git.

Code Search lets you search code from a single interface. It supports filtering by file path, repository, and language, which helps narrow down the results.

Sourcegraph provides both a web app and a CLI. In the web app, start each query with the filter repo:^src\.fedoraproject\.org/ to restrict the results to Fedora dist-git, then add your search terms.
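Because every query starts with the same repository filter, it can be convenient to build search links programmatically. The following Python snippet is a minimal sketch of that idea. It assumes that the public Sourcegraph instance accepts the query in a q parameter at https://sourcegraph.com/search; the helper name fedora_search_url is made up for this example.

```python
from urllib.parse import urlencode

# Assumption: the public instance serves searches at /search?q=<query>.
SEARCH_URL = "https://sourcegraph.com/search"

# Repository filter that restricts results to Fedora dist-git.
REPO_FILTER = r"repo:^src\.fedoraproject\.org/"


def fedora_search_url(terms: str) -> str:
    """Prefix the terms with the repo filter and URL-encode the whole query."""
    query = f"{REPO_FILTER} {terms}"
    return f"{SEARCH_URL}?{urlencode({'q': query})}"


# Build a link for a specfile search and print it.
print(fedora_search_url(r"file:\.spec$ dnf5"))
```

Opening the printed link in a browser should run the same search as typing the query into the web app.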

Figure 1. Sourcegraph code search interface

Filter Search: Using the file Keyword to Find Specfiles

The file keyword restricts results to files whose path matches the given pattern. The following query searches all Fedora dist-git repositories for files whose path ends in .spec and that contain the term dnf5, which makes locating specfiles straightforward.

repo:^src\.fedoraproject\.org/ file:\.spec$ dnf5
Figure 2. Search for specfiles
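To see what the pattern given to the file keyword selects, you can try it against a few paths locally. This is only an illustrative sketch: the paths are hypothetical, and it uses Python's re module, which for a simple pattern like this one should behave like the regular expressions Sourcegraph accepts.

```python
import re

# The same pattern that is passed to the file keyword in the query.
SPEC_PATH = re.compile(r"\.spec$")

# Hypothetical file paths, for illustration only.
paths = ["dnf5.spec", "README.md", "tests/spec_helper.rb", "sources"]

# Keep only the paths that end in ".spec".
matches = [p for p in paths if SPEC_PATH.search(p)]
print(matches)
```

The escaped dot and the trailing $ matter: without them, a path such as tests/spec_helper.rb would also be selected.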

Use the lang Filter to Find a Fedora Repository to Contribute to

The lang keyword is used to filter search results by programming language.

The following query searches our dist-git repositories for files written in Markdown that contain the term contributing. This is a handy way to find projects that welcome contributions.

repo:^src\.fedoraproject\.org/ lang:markdown contributing
Figure 3. Search for files written in Markdown

Search for Specfiles Disabling Debug Packages

Searching for the string "%global debug_package %{nil}" finds specfiles that set the debug_package macro to %{nil}. This line disables the creation of debug packages during the build.

Figure 4. Search for specfiles disabling debug packages
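If you already have a specfile checked out, the same check can be approximated locally. The sketch below is not part of Sourcegraph: the function name and the sample specfile text are invented for this example, and the pattern tolerates extra whitespace around the macro definition.

```python
import re

# Matches a line such as "%global debug_package %{nil}",
# allowing for varying amounts of spaces or tabs.
DEBUG_DISABLED = re.compile(
    r"^[ \t]*%global[ \t]+debug_package[ \t]+%\{nil\}[ \t]*$",
    re.MULTILINE,
)


def disables_debug_package(spec_text: str) -> bool:
    """Return True if the specfile text sets debug_package to %{nil}."""
    return DEBUG_DISABLED.search(spec_text) is not None


# Hypothetical specfile fragment, for illustration only.
sample = """Name: example
Version: 1.0
%global debug_package %{nil}
"""

print(disables_debug_package(sample))
```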

Find Repositories Using Popular OSI-approved Licenses

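The following query searches the RPM specfiles in all the repositories for License fields that mention popular OSI-approved licenses, that is, licenses that comply with the “Open Source Definition” (OSD), such as Apache, BSD, GPL, LGPL, MIT, MPL, CDDL, and EPL.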

repo:^src\.fedoraproject\.org/ lang:"RPM Spec" License: ^.*apache|bsd|gpl|lgpl|mit|mpl|cddl|epl.*$
Figure 5. License search
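The license alternation can be tried out locally on a few License lines. This sketch is an approximation rather than the exact query: it wraps the alternatives in parentheses so that they apply only to the License field, ignores case, and uses hypothetical sample lines. Like the query, it matches substrings, so it is a coarse filter and not a license parser.

```python
import re

# Grouped variant of the alternation from the query (an assumption made
# for this example), anchored to the License field of a specfile.
LICENSE_RE = re.compile(
    r"^License:[ \t]*.*?(apache|bsd|gpl|lgpl|mit|mpl|cddl|epl)",
    re.IGNORECASE | re.MULTILINE,
)


def has_popular_license(spec_text: str) -> bool:
    """Return True if a License line mentions one of the listed licenses."""
    return LICENSE_RE.search(spec_text) is not None


# Hypothetical License lines, for illustration only.
for line in ["License: MIT", "License: GPLv2+", "License: Zlib"]:
    print(line, "->", has_popular_license(line))
```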

Find Files with a Vulnerable Version of Log4j

This query finds files that may be affected by CVE-2021-44228, also known as Log4Shell, a vulnerability in Log4j. False positives are possible, so investigate further before concluding that a package is actually vulnerable. You can search for other vulnerabilities in the same way and report them to project maintainers.

repo:^src\.fedoraproject\.org/ org.apache.logging.log4j 2.((0|1|2|3|4|5|6|7|8|9|10|11|12|13|14|15)(.[0-9]+)) count:all
Figure 6. Search for log4j
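The version part of the query covers 2.0.x through 2.15.x. The sketch below expresses that range in Python so you can check individual version strings when reviewing results. It is an approximation: the dots are escaped, the whole string must match, and the function name is invented for this example. A match only means the version falls in the range the query looks for, not that a package is vulnerable.

```python
import re

# Minor versions 0 through 15 followed by a patch number,
# written with escaped dots.
VERSION_RANGE = re.compile(r"2\.(1[0-5]|[0-9])\.[0-9]+")


def in_query_range(version: str) -> bool:
    """Return True if the version string is in the 2.0.x to 2.15.x range."""
    return VERSION_RANGE.fullmatch(version) is not None


# Example version strings, for illustration only.
for version in ["2.14.1", "2.15.0", "2.17.1", "1.2.17"]:
    print(version, "->", in_query_range(version))
```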

Conclusion

For more search queries, see the official Sourcegraph documentation.

Having Sourcegraph Code Search integrated into our dist-git repository is a great addition to our engineering productivity toolkit. With Code Search’s powerful capabilities, our contributors and users can efficiently search across our entire universe of open source repositories from a single place.