Contributions to G4X-helpers are welcome!
This section provides some guidelines and tips to follow when you wish to enrich the project with your own code or documentation.
Index
- Development workflow
- Working with git
- Building and managing your environment
- Documentation
- Releasing a new version
Info
Parts of these guidelines have been adapted from the scanpy docs, which in turn built on the work done by pandas and MDAnalysis.
We highly recommend checking out these excellent guides to learn more.
Development workflow
The life-cycle of a new feature or other contribution should follow this pattern:
- Fork the G4X-helpers repository to your own GitHub account
- Create an environment with all dev-dependencies
- Create a new branch for your feature or bugfix
- Commit your contribution to the codebase
- Update and check the documentation
- Open a PR back to the main repository
Working with git
This section of the docs covers our practices for working with git on our codebase.
For more in-depth guides, we can recommend a few sources:
- Atlassian's git tutorial: Beginner-friendly introductions to the git command-line interface
- Setting up git for GitHub: Configuring git to work with your GitHub user account
Forking and cloning
To get the code and be able to push changes back to the main project, you'll need to (1) fork the repository on GitHub and (2) clone the repository to your local machine.
This is very straightforward if you're using GitHub's CLI:
$ gh repo fork Singular-Genomics/G4X-helpers --clone --remote
This will fork the repo to your GitHub account, create a clone of the repo on your current machine, add our repository as a remote, and set the main development branch to track our repository.
To do this manually, first make a fork of the repository by clicking the "Fork" button on our main GitHub repository page. Then, on your machine, run:
# Clone your fork of the repository (substitute in your username)
$ git clone https://github.com/{your-username}/G4X-helpers.git
# Enter the cloned repository
$ cd G4X-helpers
# Add our repository as a remote
$ git remote add upstream https://github.com/Singular-Genomics/G4X-helpers.git
Creating a branch for your feature
All development should occur in branches dedicated to the particular work being done.
Additionally, unless you are a maintainer, all changes should be directed at the main branch.
You can create a branch with:
$ git checkout main # Starting from the main branch
$ git pull # Syncing with the repo
$ git switch -c {your-branch-name} # Making and changing to the new branch
Committing your code
Keep commits small, focused, and well-described. This makes code review easier and history clearer. When you are ready, add the files that belong to your commit:
$ git status # See what changed
$ git add -p # Interactively stage only the hunks you want
# or everything
$ git add .
Write a clear commit message. Use an imperative, one-line summary (≤ 72 chars).
Tip
Need to fix the last commit before pushing? git commit --amend lets you change the message or add more files.
Opening a pull request
When you're ready to have your code reviewed, push your changes up to your fork:
# The first time you push the branch, you'll need to tell git where
$ git push --set-upstream origin {your-branch-name}
# After that, just use
$ git push
Then open a pull request by going to the main repo and clicking New pull request.
GitHub may also prompt you to open PRs for recently pushed branches.
Info
It is important to summarize your changes in the description of the PR so that they get included in the next changelog.
We'll try to get back to you soon!
Building and managing your environment
Installing project dependencies
It is recommended to develop your feature in an isolated virtual environment. There are many environment managers available for Python (conda, pyenv, virtualenv, ...).
We recommend using uv, which can manage your virtual environment and use the project's uv.lock file to replicate all dependencies from exact sources.
After installing uv, you can build the environment by calling the following from the repository root:
$ uv sync
A folder named .venv will be created. It holds the correct Python version and all project dependencies, as well as the necessary development tools such as ruff, mkdocs, pre-commit, and bump-my-version.
You can now activate this environment with (on Linux or macOS):
$ source .venv/bin/activate
Using pre-commit hooks
We use pre-commit to run various checks on new code.
In order for it to attach to your commits automatically, you need to install it once after building your environment:
$ pre-commit install
While most rules will be applied automatically, some checks may prevent your code from being committed. The pre-commit output will help you identify which sections need to be addressed.
If you choose not to run the hooks on each commit, you can run them manually with pre-commit run --files {your files}.
Note
If your environment manager did not install pre-commit as a dependency, you can do so via:
$ pip install pre-commit
Code formatting and linting
We use Ruff to format and lint the G4X-helpers codebase. Ruff is a project dependency and its rules are configured in ruff.toml. It will be invoked on all code contributions via pre-commit hooks (see above), but you can also run it manually via ruff check (linting) and ruff format (formatting).
Documentation
Docstrings
We prefer the numpydoc style for writing docstrings. We'd primarily suggest looking at existing docstrings for examples, but the Napoleon guide to NumPy-style docstrings is also a great source. If you're unfamiliar with the reStructuredText (rST) markup format, check out the Sphinx rST primer.
Look at scanpy's sc.tl.leiden as an example of a complete docstring.
Params section
The Params abbreviation is a legitimate replacement for Parameters.
To document parameter types use type annotations on function parameters. These will automatically populate the docstrings on import, and when the documentation is built.
Use the Python standard library types (defined in the collections.abc and typing modules) for containers, e.g. collections.abc.Sequences (like list), collections.abc.Iterables (like set), and collections.abc.Mappings (like dict).
Always specify what these contain, e.g. {'a': (1, 2)} → Mapping[str, Tuple[int, int]].
If you can’t use one of those, use a concrete class like AnnData.
If your parameter only accepts an enumeration of strings, specify them like so: Literal['elem-1', 'elem-2'].
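As a minimal sketch of these conventions, the following hypothetical function (summarize_counts is made up for illustration and is not part of G4X-helpers) keeps the types in the signature and leaves them out of the Params section:

```python
from collections.abc import Mapping, Sequence
from typing import Literal


# Hypothetical helper, used only to illustrate the docstring conventions.
def summarize_counts(
    counts: Mapping[str, Sequence[int]],
    method: Literal['mean', 'sum'] = 'mean',
) -> dict[str, float]:
    """
    Aggregate the count values of each gene into a single number.

    Params
    ------
    counts
        Non-empty count values, keyed by gene name.
    method
        How to aggregate the values of each gene.

    Returns
    -------
    Aggregated value per gene name.
    """
    if method == 'mean':
        return {gene: sum(vals) / len(vals) for gene, vals in counts.items()}
    if method == 'sum':
        return {gene: float(sum(vals)) for gene, vals in counts.items()}
    raise ValueError(f'Unknown method: {method!r}')
```

Calling summarize_counts({'GeneA': [1, 2, 3]}, method='sum') returns {'GeneA': 6.0}. The container types state what they contain, and the Literal annotation documents the accepted strings without repeating them in the docstring.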
Returns section
- Function returns nothing? Use None.
- Single object: pd.DataFrame — description on the next line.
- Multiple values:
  - Prefer a named tuple/dataclass.
  - Otherwise list each element on its own line:

Returns
-------
norm : AnnData
    Normalized copy of the input.
stats : dict
    Summary statistics (mean, var, n_cells).
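For the named-tuple option, one possible shape is sketched here. NormResult and normalize_total are hypothetical names chosen for illustration, not existing G4X-helpers API, and plain lists stand in for AnnData to keep the example self-contained:

```python
from collections.abc import Sequence
from typing import NamedTuple


class NormResult(NamedTuple):
    """Return value of the hypothetical `normalize_total` function."""

    norm: list[float]
    total: float


def normalize_total(values: Sequence[float]) -> NormResult:
    """
    Scale values so that they sum to one.

    Params
    ------
    values
        Non-negative values with a positive sum.

    Returns
    -------
    norm : list[float]
        Normalized copy of the input.
    total : float
        Sum of the input values before normalization.
    """
    total = float(sum(values))
    if total <= 0:
        raise ValueError('values must have a positive sum')
    return NormResult(norm=[v / total for v in values], total=total)
```

Callers can then use either result.norm and result.total or plain tuple unpacking (norm, total = normalize_total(values)), so the field names in the Returns section match what the code exposes.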
Releasing and versioning
Versioning and release tagging in G4X-helpers are handled through bump-my-version. Maintainers take care of version bumps and publishing releases.
Please do not change the project version or the changelog.
Just submit your code/docs and they will be incorporated into our release workflow.