"Zoekt, en gij zult spinazie eten" - Jan Eertink
("seek, and ye shall eat spinach" - My primary school teacher)
Zoekt is a text search engine intended for use with source code. (Pronunciation: roughly as you would pronounce “zooked” in English)
Note: This has been the maintained source for Zoekt since 2017, when it was forked from the original repository github.com/google/zoekt.
Zoekt supports fast substring and regexp matching on source code, with a rich query language that includes Boolean operators (AND, OR, NOT). It can search individual repositories, and search across multiple repositories in a large codebase. Zoekt ranks search results using a combination of code-related signals, such as whether the match is on a symbol. Because of its general design based on trigram indexing and syntactic parsing, it works well for a variety of programming languages.
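To make the idea of trigram indexing concrete, here is a minimal toy sketch in Go. It is not Zoekt's implementation: the type trigramIndex and its methods are hypothetical names invented for illustration, and the sketch ignores ranking, regexps, and on-disk storage. It only shows the core trick: look up the query's trigrams to narrow the candidate documents, then confirm each candidate with a real substring check.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// trigramIndex is a toy in-memory index (hypothetical, not Zoekt's format).
// postings maps each 3-byte substring to the IDs of documents containing it.
type trigramIndex struct {
	docs     []string
	postings map[string][]int
}

// newTrigramIndex records every distinct trigram of every document.
func newTrigramIndex(docs []string) *trigramIndex {
	idx := &trigramIndex{docs: docs, postings: make(map[string][]int)}
	for id, doc := range docs {
		seen := make(map[string]bool)
		for i := 0; i+3 <= len(doc); i++ {
			tri := doc[i : i+3]
			if !seen[tri] {
				seen[tri] = true
				idx.postings[tri] = append(idx.postings[tri], id)
			}
		}
	}
	return idx
}

// search returns the sorted IDs of documents that contain query as a substring.
func (idx *trigramIndex) search(query string) []int {
	var candidates []int
	if len(query) < 3 {
		// No trigram to look up: fall back to scanning every document.
		for id := range idx.docs {
			candidates = append(candidates, id)
		}
	} else {
		// A document is a candidate only if it contains every query trigram.
		counts := make(map[int]int)
		seen := make(map[string]bool)
		for i := 0; i+3 <= len(query); i++ {
			tri := query[i : i+3]
			if seen[tri] {
				continue
			}
			seen[tri] = true
			for _, id := range idx.postings[tri] {
				counts[id]++
			}
		}
		for id, c := range counts {
			if c == len(seen) {
				candidates = append(candidates, id)
			}
		}
		sort.Ints(candidates)
	}
	// Trigram overlap is necessary but not sufficient, so verify each candidate.
	var hits []int
	for _, id := range candidates {
		if strings.Contains(idx.docs[id], query) {
			hits = append(hits, id)
		}
	}
	return hits
}

func main() {
	idx := newTrigramIndex([]string{
		`func main() { fmt.Println("hello") }`,
		"// say hello to the world",
		"package search",
	})
	fmt.Println(idx.search("hello"))
	fmt.Println(idx.search("search"))
}
```

Because the index works on raw bytes rather than language-specific tokens, the same structure applies to any programming language, which is the property the paragraph above relies on.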
There are two main ways to use the project:
- Through individual commands, to index repositories and search them with Zoekt’s query language
- Or, through IndexServer and WebServer, which support syncing repositories from a code host and searching them through a web UI or API.
For more information about the design of Zoekt, see the docs directory.
go get github.com/sourcegraph/zoekt/
Note: It is also recommended to install Universal Ctags, as symbol information is an important signal in ranking search results. See ctags.md for more information.
Zoekt supports indexing and searching repositories on the command line. It is most useful for simple local use, or for testing and development.
Indexing a local Git repo
go install github.com/sourcegraph/zoekt/cmd/zoekt-git-index
$GOPATH/bin/zoekt-git-index -index ~/.zoekt /path/to/repo
Indexing a local directory (not Git-specific)
go install github.com/sourcegraph/zoekt/cmd/zoekt-index
$GOPATH/bin/zoekt-index -index ~/.zoekt /path/to/repo
Searching an index
go install github.com/sourcegraph/zoekt/cmd/zoekt
$GOPATH/bin/zoekt 'hello'
$GOPATH/bin/zoekt 'hello file:README'
Zoekt also includes an index server and a web server to support large-scale indexing and searching of remote repositories. The index server can be configured to periodically fetch and reindex repositories from a code host. The web server can be configured to provide search results through a web UI or API.
Indexing a GitHub organization
go install github.com/sourcegraph/zoekt/cmd/zoekt-indexserver
echo YOUR_GITHUB_TOKEN_HERE > token.txt
echo '[{"GitHubOrg": "apache", "CredentialPath": "token.txt"}]' > config.json
$GOPATH/bin/zoekt-indexserver -mirror_config config.json -data_dir ~/.zoekt/
This will fetch all repos under ‘github.com/apache’, then index them. IndexServer takes care of periodically fetching and indexing new data and cleaning up log files. See config.go for more details on this configuration.
go install github.com/sourcegraph/zoekt/cmd/zoekt-webserver
$GOPATH/bin/zoekt-webserver -index ~/.zoekt/
This will start a web server with a simple search UI at http://localhost:6070. See the query syntax documentation for more details on the query language.
If you start the web server with -rpc, it exposes a simple JSON search API at http://localhost:6070/api/search.
The JSON API supports advanced features including:
- Streaming search results (using the FlushWallTime option)
- Alternative BM25 scoring (using the UseBM25Scoring option)
- Context lines around matches (using the NumContextLines option)
Finally, the web server exposes a gRPC API that supports structured query objects and advanced search options.
Thanks to Han-Wen Nienhuis for creating Zoekt. Thanks to Alexander Neubeck for coming up with this idea and to Han-Wen Nienhuis for helping to bring it to life.