From 32920009dfe0c21874ae16ed03f4b2490a482cb8 Mon Sep 17 00:00:00 2001
From: Joseph Tran <joseph.tran@inrae.fr>
Date: Sat, 7 Jan 2023 22:55:21 +0100
Subject: [PATCH 1/7] use honkit to build gitlab pages

---
 .gitignore |   2 +
 honkit.md  | 176 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 178 insertions(+)
 create mode 100644 honkit.md

diff --git a/.gitignore b/.gitignore
index 35ee3c8..61e23c4 100644
--- a/.gitignore
+++ b/.gitignore
@@ -54,6 +54,8 @@
 # 3) add a pattern to track the file patterns of section2 even if they are in
     # subdirectories
 !*/
+node_modules
+node_modules/*
 
 # 4) specific files or folder to TRACK (the '**' sign means 'any path')
 
diff --git a/honkit.md b/honkit.md
new file mode 100644
index 0000000..70fc45a
--- /dev/null
+++ b/honkit.md
@@ -0,0 +1,176 @@
+# HonKit
+
+HonKit builds beautiful books using GitHub/Git and Markdown.
+
+![HonKit Screenshot](./honkit.png)
+
+## Documentation and Demo
+
+HonKit documentation is built by HonKit!
+
+- <https://honkit.netlify.app/>
+
+## Quick Start
+
+### Installation
+
+- Requirement: [Node.js](https://nodejs.org) [LTS](https://nodejs.org/about/releases/) version
+
+The best way to install HonKit is via **NPM** or **Yarn**.
+
+```
+$ npm init --yes
+$ npm install honkit --save-dev
+```
+
+⚠️ Warning:
+
+- If you have installed `honkit` globally, you must install each plugin globally as well
+- If you have installed `honkit` locally, you must install each plugin locally as well
+
+We recommend installing `honkit` locally.
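+
+With a local install, you can expose HonKit through npm scripts so contributors only need `npm install` (a sketch; the script names are a convention, not required by HonKit):
+
+```json
+{
+  "scripts": {
+    "build": "honkit build",
+    "serve": "honkit serve"
+  }
+}
+```
+
+Then `npm run build` and `npm run serve` work from the project root.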
+
+### Create a book
+
+HonKit can set up a boilerplate book:
+
+```
+$ npx honkit init
+```
+
+If you wish to create the book in a new directory, you can do so by running `honkit init ./directory`.
+
+Preview and serve your book using:
+
+```
+$ npx honkit serve
+```
+
+Or build the static website using:
+
+```
+$ npx honkit build
+```
+
+You can start to write your book!
+
+For more details, see [HonKit's documentation](https://honkit.netlify.app/).
+
+## Docker support
+
+HonKit provides a Docker image at [honkit/honkit](https://hub.docker.com/r/honkit/honkit).
+
+The image includes the dependencies needed for PDF/epub output.
+
+```
+docker pull honkit/honkit
+docker run -v `pwd`:`pwd` -w `pwd` --rm -it honkit/honkit honkit build
+docker run -v `pwd`:`pwd` -w `pwd` --rm -it honkit/honkit honkit pdf
+```
+
+For more details, see [docker/](./docker/).
+
+## Usage examples
+
+HonKit can be used to create a book, public documentation, an enterprise manual, a thesis, research papers, etc.
+
+You can find a list of [real-world examples](https://honkit.netlify.app/examples.html) in the documentation.
+
+## Features
+
+* Write using [Markdown](https://honkit.netlify.app/syntax/markdown.html) or [AsciiDoc](https://honkit.netlify.app/syntax/asciidoc.html)
+* Output as a website or [ebook (pdf, epub, mobi)](https://honkit.netlify.app/ebook.html)
+* [Multi-Languages](https://honkit.netlify.app/languages.html)
+* [Lexicon / Glossary](https://honkit.netlify.app/lexicon.html)
+* [Cover](https://honkit.netlify.app/ebook.html)
+* [Variables and Templating](https://honkit.netlify.app/templating/)
+* [Content References](https://honkit.netlify.app/templating/conrefs.html)
+* [Plugins](https://honkit.netlify.app/plugins/)
+* [Beautiful default theme](./packages/@honkit/theme-default)
+
+## Fork of GitBook
+
+HonKit is a fork of [GitBook (Legacy)](https://github.com/GitbookIO/gitbook).
+[GitBook (Legacy)](https://github.com/GitbookIO/gitbook) is [deprecated](https://github.com/GitbookIO/gitbook/commit/6c6ef7f4af32a2977e44dd23d3feb6ebf28970f4) and no longer maintained.
+
+HonKit aims to smooth the migration from GitBook (Legacy) to HonKit.
+
+### Compatibility with GitBook
+
+- Almost all plugins work without changes!
+- Supports `gitbook-plugin-*` packages
+    - You should install these plugins via npm or yarn
+    - `npm install gitbook-plugin-<example> --save-dev`
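+
+Plugins are then enabled in `book.json`, the same configuration file used by GitBook (Legacy); the `gitbook-plugin-` prefix is dropped (a sketch, using the placeholder name `example`):
+
+```json
+{
+  "plugins": ["example"]
+}
+```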
+
+### Differences with GitBook
+
+- Supports Node.js 14+
+- Improved `build`/`serve` performance
+    - `honkit build`: uses a file cache by default
+    - `honkit serve`: 28.2s → 0.9s in [examples/benchmark](examples/benchmark)
+    - Also supports a `--reload` flag to force a refresh
+- Improved plugin loading logic
+    - Reduced the cost of finding `honkit-plugin-*` and `gitbook-plugin-*`
+    - Supports `honkit-plugin-*` and `@scope/honkit-plugin-*` (GitBook does not support scoped modules)
+- Removed the `install` command
+    - Instead, just use `npm install` or `yarn install`
+- Removed the `global-npm` dependency
+    - You can use HonKit with another npm package manager such as `yarn`
+- Updated dependencies
+    - Upgraded to nunjucks@2, highlight.js, etc., reducing bugs
+- TypeScript
+    - Rewritten in TypeScript
+- Monorepo codebase
+    - Easier to maintain
+- [Docker support](./docker)
+
+### Migration from GitBook
+
+Replace `gitbook-cli` with `honkit`.
+
+```
+npm uninstall gitbook-cli
+npm install honkit --save-dev
+```
+
+Replace `gitbook` command with `honkit` command.
+
+```diff
+  "scripts": {
+-    "build": "gitbook build",
++    "build": "honkit build",
+-    "serve": "gitbook serve"
++    "serve": "honkit serve"
+  },
+```
+
+After that, HonKit just works!
+
+Examples of migration:
+
+- [Add a Github action to deploy · DjangoGirls/tutorial](https://github.com/DjangoGirls/tutorial/pull/1666)
+- [Migrate from GitBook to Honkit · swaroopch/byte-of-python](https://github.com/swaroopch/byte-of-python/pull/88)
+- [replace Gitbook into Honkit · yamat47/97-things-every-programmer-should-know](https://github.com/yamat47/97-things-every-programmer-should-know/pull/2)
+- [Migrate misp-book from GitBook to honkit](https://github.com/MISP/misp-book/pull/227)
+
+## Benchmarks
+
+`honkit build` benchmark:
+
+- <https://honkit.github.io/honkit/dev/bench/>
+
+## Licensing
+
+HonKit is licensed under the Apache License, Version 2.0. See [LICENSE](LICENSE) for the full license text.
+
+HonKit is a fork of [GitBook (Legacy)](https://github.com/GitbookIO/gitbook).
+GitBook is licensed under the Apache License, Version 2.0.
+
+Also, HonKit includes work from [bignerdranch/gitbook](https://github.com/bignerdranch/gitbook).
+
+## Sponsors
+
+<a href="https://www.netlify.com">
+<img src="https://www.netlify.com/img/global/badges/netlify-color-bg.svg" alt="Deploys by Netlify" />
+</a>
-- 
GitLab


From 2b3aecaceec0def444be2a5f10cd2f4b3922b08b Mon Sep 17 00:00:00 2001
From: Joseph Tran <joseph.tran@inrae.fr>
Date: Sat, 7 Jan 2023 22:59:18 +0100
Subject: [PATCH 2/7] move and add wiki pages into workflow doc

---
 SUMMARY.md                                  |  19 +++
 workflow/doc/Assembly-Mode/Hi-C-tutorial.md |  44 +++++++
 workflow/doc/Assembly-Mode/Trio-tutorial.md |  45 +++++++
 workflow/doc/Going-further.md               | 139 ++++++++++++++++++++
 workflow/doc/Known-errors.md                |   7 +
 workflow/doc/Outputs.md                     |  45 +++++++
 workflow/doc/Programs.md                    |  57 ++++++++
 workflow/doc/Quick-start.md                 |  75 +++++++++++
 workflow/doc/Tar-data-preparation.md        |  32 +++++
 workflow/documentation.md                   |  26 ++++
 10 files changed, 489 insertions(+)
 create mode 100644 SUMMARY.md
 create mode 100644 workflow/doc/Assembly-Mode/Hi-C-tutorial.md
 create mode 100644 workflow/doc/Assembly-Mode/Trio-tutorial.md
 create mode 100644 workflow/doc/Going-further.md
 create mode 100644 workflow/doc/Known-errors.md
 create mode 100644 workflow/doc/Outputs.md
 create mode 100644 workflow/doc/Programs.md
 create mode 100644 workflow/doc/Quick-start.md
 create mode 100644 workflow/doc/Tar-data-preparation.md
 create mode 100644 workflow/documentation.md

diff --git a/SUMMARY.md b/SUMMARY.md
new file mode 100644
index 0000000..7023558
--- /dev/null
+++ b/SUMMARY.md
@@ -0,0 +1,19 @@
+# Summary
+
+* [Introduction](README.md)
+* [Documentation summary](workflow/documentation.md)
+    * [Requirements](workflow/documentation.md#asm4pg-requirements)
+    * [Tutorials](workflow/documentation.md#tutorials)
+        * [Quick start](workflow/doc/Quick-start.md)
+        * [Hi-C mode](workflow/doc/Assembly-Mode/Hi-C-tutorial.md)
+        * [Trio mode](workflow/doc/Assembly-Mode/Trio-tutorial.md)
+    * [Outputs](workflow/documentation.md#outputs)
+        * [Workflow output](workflow/doc/Outputs.md)
+    * [Optional data preparation](workflow/documentation.md#optional-data-preparation)
+        * [If your data is in a tarball archive](workflow/doc/Tar-data-preparation.md)
+    * [Troubleshooting](workflow/documentation.md#known-errors)
+        * [Known errors](workflow/doc/Known-errors.md)
+    * [Software Dependencies](workflow/documentation.md#programs)
+        * [Programs listing](workflow/doc/Programs.md)
+* [Gitlab pages using honkit](honkit.md)
+
diff --git a/workflow/doc/Assembly-Mode/Hi-C-tutorial.md b/workflow/doc/Assembly-Mode/Hi-C-tutorial.md
new file mode 100644
index 0000000..154edf8
--- /dev/null
+++ b/workflow/doc/Assembly-Mode/Hi-C-tutorial.md
@@ -0,0 +1,44 @@
+Please read the [quick start](../Quick-start.md) first; some steps are omitted here.
+
+This tutorial shows how to use the workflow in Hi-C assembly mode, which takes PacBio HiFi data and Hi-C data as input.
+
+# 1. Config file
+**TO-DO : add a toy dataset fasta and hi-c.**
+```bash
+cd GenomAsm4pg/.config
+```
+
+Modify `masterconfig.yaml`. The PacBio HiFi file is `toy_dataset_hi-c.fasta`; its name is used as the key in the config. The Hi-C files are `data_r1.fasta` and `data_r2.fasta`.
+
+```yaml
+####################### job - workflow #######################
+### CONFIG
+
+IDS: ["toy_dataset_hi-c"]
+
+toy_dataset_hi-c:
+  fasta: ./GenomAsm4pg/tutorial_data/hi-c/toy_dataset_hi-c.fasta
+  run: hi-c_tutorial
+  ploidy: 2
+  busco_lineage: eudicots_odb10
+  mode: hi-c
+  r1: ./GenomAsm4pg/tutorial_data/hi-c/data_r1.fasta
+  r2: ./GenomAsm4pg/tutorial_data/hi-c/data_r2.fasta
+```
+
+# 2. Dry run
+To check the config, first do a dry run of the workflow.
+
+```bash
+sbatch job.sh dry
+```
+# 3. Run 
+If the dry run is successful, you can run the workflow.
+
+```bash
+sbatch job.sh
+```
+
+# Other assembly modes
+If you want to use parental data, follow the [Trio assembly mode tutorial](Trio-tutorial.md).
+To go further with the workflow, see [Going further](../Going-further.md).
diff --git a/workflow/doc/Assembly-Mode/Trio-tutorial.md b/workflow/doc/Assembly-Mode/Trio-tutorial.md
new file mode 100644
index 0000000..3364731
--- /dev/null
+++ b/workflow/doc/Assembly-Mode/Trio-tutorial.md
@@ -0,0 +1,45 @@
+Please read the [quick start](../Quick-start.md) first; some steps are omitted here.
+
+This tutorial shows how to use the workflow in trio assembly mode, which takes PacBio HiFi data and parental read data as input.
+
+# 1. Config file
+**TO-DO : add a toy dataset fasta and parental fasta.**
+```bash
+cd GenomAsm4pg/.config
+```
+
+Modify `masterconfig.yaml`. The PacBio HiFi file is `toy_dataset_trio.fasta`; its name is used as the key in the config. The parental read files are `data_p1.fasta` and `data_p2.fasta`.
+Parental data is used as k-mers; you can use Illumina or PacBio HiFi reads.
+
+```yaml
+####################### job - workflow #######################
+### CONFIG
+
+IDS: ["toy_dataset_trio"]
+
+toy_dataset_trio:
+  fasta: ./GenomAsm4pg/tutorial_data/trio/toy_dataset_trio.fasta
+  run: trio_tutorial
+  ploidy: 2
+  busco_lineage: eudicots_odb10
+  mode: trio
+  p1: ./GenomAsm4pg/tutorial_data/trio/data_p1.fasta
+  p2: ./GenomAsm4pg/tutorial_data/trio/data_p2.fasta
+```
+
+# 2. Dry run
+To check the config, first do a dry run of the workflow.
+
+```bash
+sbatch job.sh dry
+```
+# 3. Run 
+If the dry run is successful, you can run the workflow.
+
+```bash
+sbatch job.sh
+```
+
+# Other assembly modes
+If you want to use Hi-C data, follow the [Hi-C assembly mode tutorial](Hi-C-tutorial.md).
+To go further with the workflow, see [Going further](../Going-further.md).
diff --git a/workflow/doc/Going-further.md b/workflow/doc/Going-further.md
new file mode 100644
index 0000000..e5e6cd4
--- /dev/null
+++ b/workflow/doc/Going-further.md
@@ -0,0 +1,139 @@
+[TOC]
+
+# 1. Multiple datasets
+You can run the workflow on multiple datasets at the same time.
+
+## 1.1. All datasets
+With `masterconfig.yaml` as follows, running the workflow will assemble each dataset in its specific assembly mode.
+You can add as many datasets as you want, each with different parameters.
+
+```yaml
+IDS: ["toy_dataset", "toy_dataset_hi-c", "toy_dataset_trio"]
+
+toy_dataset:
+  fasta: "./GenomAsm4pg/tutorial_data/toy_dataset.fasta"
+  run: tutorial
+  ploidy: 2
+  busco_lineage: eudicots_odb10
+  mode: default
+
+toy_dataset_hi-c:
+  fasta: ./GenomAsm4pg/tutorial_data/hi-c/toy_dataset_hi-c.fasta
+  run: hi-c_tutorial
+  ploidy: 2
+  busco_lineage: eudicots_odb10
+  mode: hi-c
+  r1: ./GenomAsm4pg/tutorial_data/hi-c/data_r1.fasta
+  r2: ./GenomAsm4pg/tutorial_data/hi-c/data_r2.fasta
+
+toy_dataset_trio:
+  fasta: ./GenomAsm4pg/tutorial_data/trio/toy_dataset_trio.fasta
+  run: trio_tutorial
+  ploidy: 2
+  busco_lineage: eudicots_odb10
+  mode: trio
+  p1: ./GenomAsm4pg/tutorial_data/trio/data_p1.fasta
+  p2: ./GenomAsm4pg/tutorial_data/trio/data_p2.fasta
+```
+## 1.2. On chosen datasets
+You can remove datasets from `IDS` to assemble only the chosen genomes:
+```yaml
+IDS: ["toy_dataset", "toy_dataset_trio"]
+
+toy_dataset:
+  fasta: "./GenomAsm4pg/tutorial_data/toy_dataset.fasta"
+  run: tutorial
+  ploidy: 2
+  busco_lineage: eudicots_odb10
+  mode: default
+
+toy_dataset_hi-c:
+  fasta: ./GenomAsm4pg/tutorial_data/hi-c/toy_dataset_hi-c.fasta
+  run: hi-c_tutorial
+  ploidy: 2
+  busco_lineage: eudicots_odb10
+  mode: hi-c
+  r1: ./GenomAsm4pg/tutorial_data/hi-c/data_r1.fasta
+  r2: ./GenomAsm4pg/tutorial_data/hi-c/data_r2.fasta
+
+toy_dataset_trio:
+  fasta: ./GenomAsm4pg/tutorial_data/trio/toy_dataset_trio.fasta
+  run: trio_tutorial
+  ploidy: 2
+  busco_lineage: eudicots_odb10
+  mode: trio
+  p1: ./GenomAsm4pg/tutorial_data/trio/data_p1.fasta
+  p2: ./GenomAsm4pg/tutorial_data/trio/data_p2.fasta
+```
+Running the workflow with this config will assemble only `toy_dataset` and `toy_dataset_trio`.
+
+# 2. Different run names
+If you want to try different parameters on the same dataset, changing the run name will create a new directory and keep the previous data.
+
+In the [Hi-C tutorial](Assembly-Mode/Hi-C-tutorial.md), we used the following config.
+```yaml
+IDS: ["toy_dataset_hi-c"]
+
+toy_dataset_hi-c:
+  fasta: ./GenomAsm4pg/tutorial_data/hi-c/toy_dataset_hi-c.fasta
+  run: hi-c_tutorial
+  ploidy: 2
+  busco_lineage: eudicots_odb10
+  mode: hi-c
+  r1: ./GenomAsm4pg/tutorial_data/hi-c/data_r1.fasta
+  r2: ./GenomAsm4pg/tutorial_data/hi-c/data_r2.fasta
+```
+
+If you want to compare the Hi-C and default assembly modes, you can run the workflow with a different run name and the default mode.
+```yaml
+IDS: ["toy_dataset_hi-c"]
+
+toy_dataset_hi-c:
+  fasta: ./GenomAsm4pg/tutorial_data/hi-c/toy_dataset_hi-c.fasta
+  run: default_comparaison
+  ploidy: 2
+  busco_lineage: eudicots_odb10
+  mode: default
+```
+You will end up with 2 sub-directories for toy_dataset_hi-c (`hi-c_tutorial` and `default_comparaison`) and keep the data from the previous run in Hi-C mode.
+
+# 3. The same dataset with different parameters at once
+If you want to do the previous example in a single run, you must create a symbolic link to the fasta file under a different filename.
+
+YAML does not allow duplicate keys, so the following config does not work:
+```yaml
+## DOES NOT WORK
+IDS: ["toy_dataset_hi-c"]
+
+toy_dataset_hi-c:
+  run: hi-c_tutorial
+  ploidy: 2
+  busco_lineage: eudicots_odb10
+  mode: hi-c
+  r1: ./GenomAsm4pg/tutorial_data/hi-c/data_r1.fasta
+  r2: ./GenomAsm4pg/tutorial_data/hi-c/data_r2.fasta
+
+toy_dataset_hi-c:
+  run: default_comparaison
+  ploidy: 2
+  busco_lineage: eudicots_odb10
+  mode: default
+```
+
+**TO COMPLETE**
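+
+As a sketch of the workaround described above: create a symbolic link, then add a second config key pointing at the new name (the `_default` filename and key are hypothetical):
+
+```bash
+# run inside the directory holding the reads, e.g. GenomAsm4pg/tutorial_data/hi-c
+# give the same HiFi reads a second, hypothetical name
+ln -s toy_dataset_hi-c.fasta toy_dataset_hi-c_default.fasta
+```
+
+`toy_dataset_hi-c_default` can then be listed in `IDS` with its own `run` name and `mode: default`, alongside the original `toy_dataset_hi-c` key.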
+
+# 4. Optional fastq and bam files
+If fastq and bam files are available and you want to run raw-data QC with FastQC and LongQC, add the `fastq` and/or `bam` keys to your config. The fasta, fastq and bam filenames have to match. For example:
+
+```yaml
+IDS: ["toy_dataset"]
+
+toy_dataset:
+  fasta: "./GenomAsm4pg/tutorial_data/toy_dataset.fasta"
+  fastq: "./GenomAsm4pg/tutorial_data/toy_dataset.fastq"
+  bam: "./GenomAsm4pg/tutorial_data/toy_dataset.bam"
+  run: tutorial
+  ploidy: 2
+  busco_lineage: eudicots_odb10
+  mode: default
+```
\ No newline at end of file
diff --git a/workflow/doc/Known-errors.md b/workflow/doc/Known-errors.md
new file mode 100644
index 0000000..ef83716
--- /dev/null
+++ b/workflow/doc/Known-errors.md
@@ -0,0 +1,7 @@
+[TOC]
+
+# One of the BUSCO rules failed
+The first time you run the workflow, the BUSCO lineage may be downloaded multiple times. This can create a conflict between jobs using BUSCO and interrupt some of them. In that case, simply rerun the workflow once all current jobs have finished.
+
+# Snakemake locked directory
+When you try to rerun the workflow after cancelling a job, you may have to unlock the results directory. To do so, go to `.config/snakemake_profile/slurm` and uncomment line 14 of `config.yaml`. Run the workflow once to unlock the directory (this should only take a few seconds). Then re-comment line 14 in `config.yaml`. The workflow will now be able to run and create outputs.
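+
+For reference, the profile option being toggled maps to Snakemake's `--unlock` flag; the relevant part of `config.yaml` likely looks like this (a sketch — the actual contents and line number may differ):
+
+```yaml
+# .config/snakemake_profile/slurm/config.yaml (excerpt, assumed)
+# unlock: True   # uncomment once to unlock the results directory, then re-comment
+```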
\ No newline at end of file
diff --git a/workflow/doc/Outputs.md b/workflow/doc/Outputs.md
new file mode 100644
index 0000000..d27e5eb
--- /dev/null
+++ b/workflow/doc/Outputs.md
@@ -0,0 +1,45 @@
+[TOC]
+
+# Directories
+There are three directories for the data produced by the workflow:
+- An automatic report is generated in the `RUN` directory.
+- `01_raw_data_QC` contains all quality control run on the reads. FastQC and LongQC create HTML reports on the fastq and bam files respectively, read stats are produced by GenomeTools, and genome size and heterozygosity predictions are produced by GenomeScope (in directory `04_kmer`).
+- `02_genome_assembly` contains two assemblies. The first, in `01_raw_assembly`, is the assembly produced by hifiasm. The second, in `02_after_purge_dups_assembly`, is the hifiasm assembly after haplotig removal by purge_dups. Both assemblies have a `01_assembly_QC` directory containing assembly statistics from GenomeTools (`assembly_stats`), BUSCO analyses (`busco`), k-mer profiles from KAT (`katplot`), completeness and QV stats from Merqury (`merqury`), and assembled telomeres from FindTelomeres (`telomeres`).
+- `benchmark` contains the runtimes of the main programs
+
+```
+workflow_results
+├── 00_input_data
+└── FILENAME
+    └── RUN
+        ├── 01_raw_data_QC
+        │   ├── 01_fastQC
+        │   ├── 02_longQC
+        │   ├── 03_genometools
+        |   └── 04_kmer
+        |       └── genomescope
+        └── 02_genome_assembly
+            ├── 01_raw_assembly
+            │   ├── 00_assembly
+            |   └── 01_assembly_QC
+            |       ├── assembly_stats
+            |       ├── busco
+            |       ├── katplot
+            |       ├── merqury
+            |       └── telomeres
+            └── 02_after_purge_dups_assembly
+                ├── 00_assembly
+                |   ├── hap1
+                |   └── hap2
+                └── 01_assembly_QC
+                    ├── assembly_stats
+                    ├── busco
+                    ├── katplot
+                    ├── merqury
+                    └── telomeres
+```
+
+# Additional files
+- Symbolic links to haplotype 1 and haplotype 2 assemblies after purge_dups
+- HTML report with the main results from each program
+- Runtime file with the total workflow runtime for the dataset
diff --git a/workflow/doc/Programs.md b/workflow/doc/Programs.md
new file mode 100644
index 0000000..80e0469
--- /dev/null
+++ b/workflow/doc/Programs.md
@@ -0,0 +1,57 @@
+# Workflow steps and program versions
+The workflow's programs run inside container images that Snakemake pulls automatically the first time you run the workflow. This may take some time, but images are downloaded only once and reused on subsequent runs.
+
+## 1. Pre-assembly
+- Conversion of PacBio bam to fasta & fastq
+    - [smrtlink](https://www.pacb.com/support/software-downloads/) 9.0.0
+- Fastq to fasta conversion
+    - [seqtk](https://github.com/lh3/seqtk) 1.3
+- Raw data quality control
+    - [fastqc](https://github.com/s-andrews/FastQC) 0.11.5
+    - [LongQC](https://github.com/yfukasawa/LongQC) 1.2.0c
+- Metrics
+    - [genometools](https://github.com/genometools/genometools) 1.5.9
+- K-mer analysis
+    - [jellyfish](https://github.com/gmarcais/Jellyfish) 2.3.0
+    - [genomescope](https://github.com/tbenavi1/genomescope2.0) 2.0
+
+## 2. Assembly
+- Assembly
+    - [hifiasm](https://github.com/chhylp123/hifiasm) 0.16.1
+- Metrics
+    - [genometools](https://github.com/genometools/genometools) 1.5.9
+- Assembly quality control
+    - [BUSCO](https://gitlab.com/ezlab/busco) 5.3.1
+    - [KAT](https://github.com/TGAC/KAT) 2.4.1
+- Error rate, QV & phasing
+    - [meryl](https://github.com/marbl/meryl) and [merqury](https://github.com/marbl/merqury) 1.3
+- Detect assembled telomeres
+    - [FindTelomeres](https://github.com/JanaSperschneider/FindTelomeres)
+        - **Biopython** 1.75 
+- Haplotigs and overlaps purging 
+    - [purge_dups](https://github.com/dfguan/purge_dups) 1.2.5
+        - **matplotlib** 0.11.5
+
+## 3. Report
+- **R markdown** 4.0.3
+
+# Docker images
+Images are stored on the project's container registry but come from various container libraries:
+
+- [smrtlink](https://hub.docker.com/r/bryce911/smrtlink/tags)
+- [seqtk](https://hub.docker.com/r/nanozoo/seqtk)
+- [fastqc](https://hub.docker.com/r/biocontainers/fastqc/tags)
+- [LongQC](https://hub.docker.com/r/grpiccoli/longqc/tags)
+- [genometools](https://hub.docker.com/r/biocontainers/genometools/tags)
+- [jellyfish](https://quay.io/repository/biocontainers/kmer-jellyfish?tab=tags)
+- [genomescope](https://hub.docker.com/r/abner12/genomescope)
+- [hifiasm](https://quay.io/repository/biocontainers/hifiasm?tab=tags)
+- [BUSCO](https://hub.docker.com/r/ezlabgva/busco/tags)
+- [KAT](https://quay.io/repository/biocontainers/kat)
+- [meryl and merqury](https://quay.io/repository/biocontainers/merqury?tab=tags)
+- [Biopython for FindTelomeres](https://quay.io/repository/biocontainers/biopython?tab=tags)
+- [purge_dups](https://quay.io/repository/biocontainers/purge_dups?tab=tags)
+- [matplotlib as companion to purge_dups](https://hub.docker.com/r/biocontainers/matplotlib-venn/tags)
+- [R markdown](https://hub.docker.com/r/reslp/rmarkdown/tags)
diff --git a/workflow/doc/Quick-start.md b/workflow/doc/Quick-start.md
new file mode 100644
index 0000000..29f6bc4
--- /dev/null
+++ b/workflow/doc/Quick-start.md
@@ -0,0 +1,75 @@
+This tutorial shows how to use the workflow in default assembly mode, which takes PacBio HiFi data as input.
+
+[TOC]
+
+# Clone repository
+```bash
+cd .
+git clone https://forgemia.inra.fr/asm4pg/GenomAsm4pg.git
+```
+
+# 1. Cluster profile setup
+```bash
+cd GenomAsm4pg/.config/snakemake_profile
+```
+The current profile is made for SLURM. If you use it, change line 13 of the `cluster_config.yml` file to your email address.
+
+To run this workflow on another HPC scheduler, create another [Snakemake profile](https://github.com/Snakemake-Profiles) and add it to the `.config/snakemake_profile` directory. Then change the `CLUSTER_CONFIG` and `PROFILE` variables in the `job.sh` and `prejob.sh` scripts.
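+
+In `job.sh` and `prejob.sh`, those two variables would then point at your new profile (a sketch; the `my_cluster` paths are placeholders):
+
+```bash
+# sketch — adjust to the profile you added under .config/snakemake_profile
+CLUSTER_CONFIG=.config/snakemake_profile/my_cluster/cluster_config.yml
+PROFILE=.config/snakemake_profile/my_cluster
+```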
+
+
+# 2. Config file
+**TO-DO : add a toy fasta.**
+```bash
+cd ..
+```
+
+Modify `masterconfig.yaml`. `root` is the path where the output data will be written.
+```yaml
+# absolute path to your desired output path
+root: ./GenomAsm4pg/tutorial_output
+```
+
+The reads file is `toy_dataset.fasta`; its name is used as the key in the config.
+
+```yaml
+####################### job - workflow #######################
+### CONFIG
+IDS: ["toy_dataset"]
+
+toy_dataset:
+  fasta: "./GenomAsm4pg/tutorial_data/toy_dataset.fasta"
+  run: tutorial
+  ploidy: 2
+  busco_lineage: eudicots_odb10
+  mode: default
+```
+
+# 3. Create slurm_logs directory
+```bash
+cd ..
+mkdir slurm_logs
+```
+SLURM logs for each rule will be written to this directory. There are `.out` and `.err` files for the workflow (`snakemake.cortex*`) and for each rule (`rulename.cortex*`).
+
+# 4. Mail setup
+Change line 17 of `job.sh` to your email address.
+
+# 5. Dry run
+To check the config, first do a dry run of the workflow.
+
+```bash
+sbatch job.sh dry
+```
+# 6. Run 
+If the dry run is successful, check that the `SNG_BIND` variable in `job.sh` matches the `root` variable in `masterconfig.yaml`.
+
+If Singularity is not in the HPC environment, add `module load singularity` under `module load snakemake/6.5.1`.
+
+You can run the workflow.
+
+```bash
+sbatch job.sh
+```
+
+# Other assembly modes
+If you want to use additional Hi-C data or parental data, follow the [Hi-C assembly mode tutorial](Assembly-Mode/Hi-C-tutorial.md) or the [Trio assembly mode tutorial](Assembly-Mode/Trio-tutorial.md). To go further with the workflow, see [Going further](Going-further.md).
diff --git a/workflow/doc/Tar-data-preparation.md b/workflow/doc/Tar-data-preparation.md
new file mode 100644
index 0000000..6a92ec1
--- /dev/null
+++ b/workflow/doc/Tar-data-preparation.md
@@ -0,0 +1,32 @@
+If your data is in a tarball, this companion workflow will extract the data and convert bam files to fastq and fasta if necessary.
+
+[TOC]
+
+# 1. Config file
+```bash
+cd GenomAsm4pg/.config
+```
+Modify the `data` variable in `.config/masterconfig.yaml` to be the path to the directory containing all input tar files.
+This workflow can automatically determine the filenames in the given `data` directory, or run only on chosen files:
+- `get_all_tar_filename: True` will uncompress all tar files. If you want to choose which files to uncompress, use `get_all_tar_filename: False` and give the filenames as a list in `tarIDS`
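+
+For example, to extract only two chosen archives (a sketch; the directory and tar filenames are hypothetical):
+
+```yaml
+data: ./GenomAsm4pg/raw_tar_data
+get_all_tar_filename: False
+tarIDS: ["sample_A.tar", "sample_B.tar"]
+```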
+
+# 2. Run 
+Modify the `SNG_BIND` variable in `prejob.sh`; it has to match the `root` variable in `.config/masterconfig.yaml`. Change line 17 to your email address.
+If Singularity is not in the HPC environment, add `module load singularity` under "Module loading".
+
+Then run
+
+```bash
+sbatch prejob.sh
+```
+
+# 3. Outputs
+This will create multiple directories to prepare the data for the workflow. You will end up with a `bam_files` directory containing all *bam* files (renamed to the tar filename if the file inside was named "ccs.bam"), and a `fastx_files` directory containing all *fasta* and *fastq* files. The `extract` directory contains all other files that were in the tarball.
+
+```
+workflow_results
+└── 00_raw_data
+	├── bam_files
+	├── extract
+	└── fastx_files
+```
\ No newline at end of file
diff --git a/workflow/documentation.md b/workflow/documentation.md
new file mode 100644
index 0000000..a698a4c
--- /dev/null
+++ b/workflow/documentation.md
@@ -0,0 +1,26 @@
+Asm4pg is an automatic and reproducible genome assembly workflow for pangenomic applications using PacBio HiFi data.
+
+[TOC]
+
+# Asm4pg Requirements
+- snakemake >= 6.5.1
+- singularity
+
+The workflow does not work on HPC clusters that do not allow a job to submit other jobs.
+
+# Tutorials
+The three assembly modes from hifiasm are available:
+- [Quick start (default mode)](doc/Quick-start.md)
+- [Hi-C mode](doc/Assembly-Mode/Hi-C-tutorial.md)
+- [Trio mode](doc/Assembly-Mode/Trio-tutorial.md)
+
+# Outputs
+[Workflow outputs](doc/Outputs.md)
+
+# Optional Data Preparation
+If your [data is in a tarball](doc/Tar-data-preparation.md), a companion workflow can extract and convert it.
+
+# Known errors
+You may run into [these errors](doc/Known-errors.md).
+
+# Programs
+[Programs used in the workflow](doc/Programs.md)
-- 
GitLab


From 9d70d4f4ae5d4c66bec24564edc5c6940802463b Mon Sep 17 00:00:00 2001
From: Joseph Tran <joseph.tran@inrae.fr>
Date: Sat, 7 Jan 2023 23:01:28 +0100
Subject: [PATCH 3/7] add gitlab ci cd config to build gitlab pages with honkit

---
 .gitlab-ci.yml | 32 ++++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)
 create mode 100644 .gitlab-ci.yml

diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
new file mode 100644
index 0000000..1997dc0
--- /dev/null
+++ b/.gitlab-ci.yml
@@ -0,0 +1,32 @@
+# use the Node.js 10 environment
+image: node:10
+
+# add 'node_modules' to cache for speeding up builds
+cache:
+  paths:
+    - node_modules/ # Node modules and dependencies
+
+before_script:
+  - npm init --yes
+  - npm install honkit --save-dev
+
+test:
+  stage: test
+  script:
+    - npx honkit build . public # build to public path
+  only:
+    - branches # this job will affect every branch except 'main'
+  except:
+    - main
+    
+# the 'pages' job will deploy and build your site to the 'public' path
+pages:
+  stage: deploy
+  script:
+    - npx honkit build . public # build to public path
+  artifacts:
+    paths:
+      - public
+    expire_in: 1 week
+  only:
+    - main # this job will affect only the 'main' branch
-- 
GitLab


From 9711e6f965ed735ef57793669d6e9d5a405d2321 Mon Sep 17 00:00:00 2001
From: Joseph Tran <joseph.tran@inrae.fr>
Date: Mon, 16 Jan 2023 14:33:34 +0100
Subject: [PATCH 4/7] move dag file into workflow doc and update links

---
 README.md                 |   2 +-
 fig/.gitkeep              |   0
 fig/rule_dag.svg          | 493 --------------------------------------
 workflow/documentation.md |  19 +-
 4 files changed, 13 insertions(+), 501 deletions(-)
 delete mode 100644 fig/.gitkeep
 delete mode 100644 fig/rule_dag.svg

diff --git a/README.md b/README.md
index 517a7bb..3ae5d70 100644
--- a/README.md
+++ b/README.md
@@ -5,7 +5,7 @@ This workflow uses [Snakemake](https://snakemake.readthedocs.io/en/stable/) to q
 
 A first script (```prejob.sh```) prepares the data until *fasta.gz* files are obtained. A second script (```job.sh```) runs the genome assembly and stats.
 
-![workflow DAG](fig/rule_dag.svg)
+![workflow DAG](workflow/doc/fig/rule_dag.svg)
 
 ## Table of contents
 [TOC]
diff --git a/fig/.gitkeep b/fig/.gitkeep
deleted file mode 100644
index e69de29..0000000
diff --git a/fig/rule_dag.svg b/fig/rule_dag.svg
deleted file mode 100644
index b225cdd..0000000
--- a/fig/rule_dag.svg
+++ /dev/null
@@ -1,493 +0,0 @@
-<?xml version="1.0" encoding="UTF-8" standalone="no"?>
-<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN"
- "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
-<!-- Generated by graphviz version 2.40.1 (20161225.0304)
- -->
-<!-- Title: snakemake_dag Pages: 1 -->
-<svg width="1382pt" height="620pt"
- viewBox="0.00 0.00 1382.00 620.00" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink">
-<g id="graph0" class="graph" transform="scale(1 1) rotate(0) translate(4 616)">
-<title>snakemake_dag</title>
-<polygon fill="#ffffff" stroke="transparent" points="-4,4 -4,-616 1378,-616 1378,4 -4,4"/>
-<!-- 0 -->
-<g id="node1" class="node">
-<title>0</title>
-<path fill="none" stroke="#d87556" stroke-width="2" d="M784,-36C784,-36 754,-36 754,-36 748,-36 742,-30 742,-24 742,-24 742,-12 742,-12 742,-6 748,0 754,0 754,0 784,0 784,0 790,0 796,-6 796,-12 796,-12 796,-24 796,-24 796,-30 790,-36 784,-36"/>
-<text text-anchor="middle" x="769" y="-15.5" font-family="sans" font-size="10.00" fill="#000000">all</text>
-</g>
-<!-- 1 -->
-<g id="node2" class="node">
-<title>1</title>
-<path fill="none" stroke="#d8cb56" stroke-width="2" d="M136,-252C136,-252 12,-252 12,-252 6,-252 0,-246 0,-240 0,-240 0,-228 0,-228 0,-222 6,-216 12,-216 12,-216 136,-216 136,-216 142,-216 148,-222 148,-228 148,-228 148,-240 148,-240 148,-246 142,-252 136,-252"/>
-<text text-anchor="middle" x="74" y="-231.5" font-family="sans" font-size="10.00" fill="#000000">genometools_on_raw_data</text>
-</g>
-<!-- 1&#45;&gt;0 -->
-<g id="edge15" class="edge">
-<title>1&#45;&gt;0</title>
-<path fill="none" stroke="#c0c0c0" stroke-width="2" d="M99.5393,-215.8176C149.1815,-181.4737 263.2946,-107.3462 371,-72 437.7811,-50.0841 646.1884,-29.1986 731.7583,-21.3109"/>
-<polygon fill="#c0c0c0" stroke="#c0c0c0" stroke-width="2" points="732.1653,-24.7884 741.8051,-20.3926 731.5281,-17.8174 732.1653,-24.7884"/>
-</g>
-<!-- 20 -->
-<g id="node21" class="node">
-<title>20</title>
-<path fill="none" stroke="#56d8c9" stroke-width="2" d="M711,-180C711,-180 681,-180 681,-180 675,-180 669,-174 669,-168 669,-168 669,-156 669,-156 669,-150 675,-144 681,-144 681,-144 711,-144 711,-144 717,-144 723,-150 723,-156 723,-156 723,-168 723,-168 723,-174 717,-180 711,-180"/>
-<text text-anchor="middle" x="696" y="-159.5" font-family="sans" font-size="10.00" fill="#000000">report</text>
-</g>
-<!-- 1&#45;&gt;20 -->
-<g id="edge46" class="edge">
-<title>1&#45;&gt;20</title>
-<path fill="none" stroke="#c0c0c0" stroke-width="2" d="M148.133,-217.4858C151.1229,-216.9584 154.0872,-216.4602 157,-216 344.4541,-186.3819 571.2971,-169.904 658.9529,-164.254"/>
-<polygon fill="#c0c0c0" stroke="#c0c0c0" stroke-width="2" points="659.1831,-167.7465 668.9406,-163.6185 658.7385,-160.7607 659.1831,-167.7465"/>
-</g>
-<!-- 2 -->
-<g id="node3" class="node">
-<title>2</title>
-<path fill="none" stroke="#59d856" stroke-width="2" d="M735.5,-252C735.5,-252 674.5,-252 674.5,-252 668.5,-252 662.5,-246 662.5,-240 662.5,-240 662.5,-228 662.5,-228 662.5,-222 668.5,-216 674.5,-216 674.5,-216 735.5,-216 735.5,-216 741.5,-216 747.5,-222 747.5,-228 747.5,-228 747.5,-240 747.5,-240 747.5,-246 741.5,-252 735.5,-252"/>
-<text text-anchor="middle" x="705" y="-231.5" font-family="sans" font-size="10.00" fill="#000000">genomescope</text>
-</g>
-<!-- 2&#45;&gt;0 -->
-<g id="edge13" class="edge">
-<title>2&#45;&gt;0</title>
-<path fill="none" stroke="#c0c0c0" stroke-width="2" d="M715.2716,-215.9496C720.8057,-205.6614 727.418,-192.3864 732,-180 748.7262,-134.7848 759.3817,-79.5701 764.7649,-46.6773"/>
-<polygon fill="#c0c0c0" stroke="#c0c0c0" stroke-width="2" points="768.2812,-46.8494 766.3895,-36.4248 761.3675,-45.7538 768.2812,-46.8494"/>
-</g>
-<!-- 2&#45;&gt;20 -->
-<g id="edge44" class="edge">
-<title>2&#45;&gt;20</title>
-<path fill="none" stroke="#c0c0c0" stroke-width="2" d="M702.7289,-215.8314C701.7664,-208.131 700.6218,-198.9743 699.5521,-190.4166"/>
-<polygon fill="#c0c0c0" stroke="#c0c0c0" stroke-width="2" points="703.0151,-189.9019 698.3017,-180.4133 696.0691,-190.7702 703.0151,-189.9019"/>
-</g>
-<!-- 3 -->
-<g id="node4" class="node">
-<title>3</title>
-<path fill="none" stroke="#5673d8" stroke-width="2" d="M720.5,-324C720.5,-324 689.5,-324 689.5,-324 683.5,-324 677.5,-318 677.5,-312 677.5,-312 677.5,-300 677.5,-300 677.5,-294 683.5,-288 689.5,-288 689.5,-288 720.5,-288 720.5,-288 726.5,-288 732.5,-294 732.5,-300 732.5,-300 732.5,-312 732.5,-312 732.5,-318 726.5,-324 720.5,-324"/>
-<text text-anchor="middle" x="705" y="-303.5" font-family="sans" font-size="10.00" fill="#000000">jellyfish</text>
-</g>
-<!-- 3&#45;&gt;2 -->
-<g id="edge16" class="edge">
-<title>3&#45;&gt;2</title>
-<path fill="none" stroke="#c0c0c0" stroke-width="2" d="M705,-287.8314C705,-280.131 705,-270.9743 705,-262.4166"/>
-<polygon fill="#c0c0c0" stroke="#c0c0c0" stroke-width="2" points="708.5001,-262.4132 705,-252.4133 701.5001,-262.4133 708.5001,-262.4132"/>
-</g>
-<!-- 9 -->
-<g id="node10" class="node">
-<title>9</title>
-<path fill="none" stroke="#56d873" stroke-width="2" d="M632,-252C632,-252 602,-252 602,-252 596,-252 590,-246 590,-240 590,-240 590,-228 590,-228 590,-222 596,-216 602,-216 602,-216 632,-216 632,-216 638,-216 644,-222 644,-228 644,-228 644,-240 644,-240 644,-246 638,-252 632,-252"/>
-<text text-anchor="middle" x="617" y="-231.5" font-family="sans" font-size="10.00" fill="#000000">kat</text>
-</g>
-<!-- 3&#45;&gt;9 -->
-<g id="edge21" class="edge">
-<title>3&#45;&gt;9</title>
-<path fill="none" stroke="#c0c0c0" stroke-width="2" d="M682.7939,-287.8314C671.955,-278.9632 658.7556,-268.1637 647.0322,-258.5718"/>
-<polygon fill="#c0c0c0" stroke="#c0c0c0" stroke-width="2" points="649.155,-255.7864 639.199,-252.1628 644.7223,-261.2041 649.155,-255.7864"/>
-</g>
-<!-- 15 -->
-<g id="node16" class="node">
-<title>15</title>
-<path fill="none" stroke="#d86656" stroke-width="2" d="M927.5,-252C927.5,-252 886.5,-252 886.5,-252 880.5,-252 874.5,-246 874.5,-240 874.5,-240 874.5,-228 874.5,-228 874.5,-222 880.5,-216 886.5,-216 886.5,-216 927.5,-216 927.5,-216 933.5,-216 939.5,-222 939.5,-228 939.5,-228 939.5,-240 939.5,-240 939.5,-246 933.5,-252 927.5,-252"/>
-<text text-anchor="middle" x="907" y="-231.5" font-family="sans" font-size="10.00" fill="#000000">purge_kat</text>
-</g>
-<!-- 3&#45;&gt;15 -->
-<g id="edge29" class="edge">
-<title>3&#45;&gt;15</title>
-<path fill="none" stroke="#c0c0c0" stroke-width="2" d="M732.5906,-292.313C736.0602,-290.7755 739.5886,-289.2979 743,-288 793.4297,-268.8133 810.6621,-271.161 865.0412,-252.1161"/>
-<polygon fill="#c0c0c0" stroke="#c0c0c0" stroke-width="2" points="866.2502,-255.4008 874.4805,-248.7291 863.886,-248.8121 866.2502,-255.4008"/>
-</g>
-<!-- 4 -->
-<g id="node5" class="node">
-<title>4</title>
-<path fill="none" stroke="#5663d8" stroke-width="2" d="M304,-252C304,-252 178,-252 178,-252 172,-252 166,-246 166,-240 166,-240 166,-228 166,-228 166,-222 172,-216 178,-216 178,-216 304,-216 304,-216 310,-216 316,-222 316,-228 316,-228 316,-240 316,-240 316,-246 310,-252 304,-252"/>
-<text text-anchor="middle" x="241" y="-231.5" font-family="sans" font-size="10.00" fill="#000000">genometools_on_assembly</text>
-</g>
-<!-- 4&#45;&gt;0 -->
-<g id="edge11" class="edge">
-<title>4&#45;&gt;0</title>
-<path fill="none" stroke="#c0c0c0" stroke-width="2" d="M256.2603,-215.7724C286.216,-181.3512 356.4925,-107.0975 433,-72 485.8272,-47.7658 655.9142,-28.8862 731.8209,-21.4448"/>
-<polygon fill="#c0c0c0" stroke="#c0c0c0" stroke-width="2" points="732.2681,-24.918 741.8846,-20.4713 731.594,-17.9505 732.2681,-24.918"/>
-</g>
-<!-- 4&#45;&gt;20 -->
-<g id="edge42" class="edge">
-<title>4&#45;&gt;20</title>
-<path fill="none" stroke="#c0c0c0" stroke-width="2" d="M316.0569,-217.6576C319.0743,-217.0826 322.0642,-216.5276 325,-216 446.618,-194.1436 591.8842,-175.007 658.8425,-166.576"/>
-<polygon fill="#c0c0c0" stroke="#c0c0c0" stroke-width="2" points="659.4136,-170.0319 668.901,-165.316 658.5434,-163.0862 659.4136,-170.0319"/>
-</g>
-<!-- 5 -->
-<g id="node6" class="node">
-<title>5</title>
-<path fill="none" stroke="#9fd856" stroke-width="2" d="M628.5,-540C628.5,-540 551.5,-540 551.5,-540 545.5,-540 539.5,-534 539.5,-528 539.5,-528 539.5,-516 539.5,-516 539.5,-510 545.5,-504 551.5,-504 551.5,-504 628.5,-504 628.5,-504 634.5,-504 640.5,-510 640.5,-516 640.5,-516 640.5,-528 640.5,-528 640.5,-534 634.5,-540 628.5,-540"/>
-<text text-anchor="middle" x="590" y="-519.5" font-family="sans" font-size="10.00" fill="#000000">hap_gfa_to_fasta</text>
-</g>
-<!-- 5&#45;&gt;4 -->
-<g id="edge17" class="edge">
-<title>5&#45;&gt;4</title>
-<path fill="none" stroke="#c0c0c0" stroke-width="2" d="M555.689,-503.8043C538.1887,-493.9913 516.8811,-481.2054 499,-468 406.8037,-399.9121 309.7037,-304.3778 265.4985,-259.3605"/>
-<polygon fill="#c0c0c0" stroke="#c0c0c0" stroke-width="2" points="267.8629,-256.7723 258.3688,-252.0699 262.8583,-261.6666 267.8629,-256.7723"/>
-</g>
-<!-- 8 -->
-<g id="node9" class="node">
-<title>8</title>
-<path fill="none" stroke="#5692d8" stroke-width="2" d="M487,-324C487,-324 415,-324 415,-324 409,-324 403,-318 403,-312 403,-312 403,-300 403,-300 403,-294 409,-288 415,-288 415,-288 487,-288 487,-288 493,-288 499,-294 499,-300 499,-300 499,-312 499,-312 499,-318 493,-324 487,-324"/>
-<text text-anchor="middle" x="451" y="-303.5" font-family="sans" font-size="10.00" fill="#000000">unzip_hap_fasta</text>
-</g>
-<!-- 5&#45;&gt;8 -->
-<g id="edge20" class="edge">
-<title>5&#45;&gt;8</title>
-<path fill="none" stroke="#c0c0c0" stroke-width="2" d="M563.6467,-503.9739C551.0609,-494.3891 536.494,-481.7538 526,-468 493.9697,-426.0199 471.354,-368.1635 459.8086,-334.1571"/>
-<polygon fill="#c0c0c0" stroke="#c0c0c0" stroke-width="2" points="463.0483,-332.8064 456.5828,-324.4131 456.403,-335.0064 463.0483,-332.8064"/>
-</g>
-<!-- 5&#45;&gt;9 -->
-<g id="edge22" class="edge">
-<title>5&#45;&gt;9</title>
-<path fill="none" stroke="#c0c0c0" stroke-width="2" d="M593.0424,-503.5454C594.6818,-493.1306 596.641,-479.8638 598,-468 606.4665,-394.086 612.5189,-306.6126 615.2955,-262.4929"/>
-<polygon fill="#c0c0c0" stroke="#c0c0c0" stroke-width="2" points="618.8053,-262.442 615.9303,-252.2447 611.8187,-262.0091 618.8053,-262.442"/>
-</g>
-<!-- 12 -->
-<g id="node13" class="node">
-<title>12</title>
-<path fill="none" stroke="#56d8b9" stroke-width="2" d="M1021.5,-396C1021.5,-396 972.5,-396 972.5,-396 966.5,-396 960.5,-390 960.5,-384 960.5,-384 960.5,-372 960.5,-372 960.5,-366 966.5,-360 972.5,-360 972.5,-360 1021.5,-360 1021.5,-360 1027.5,-360 1033.5,-366 1033.5,-372 1033.5,-372 1033.5,-384 1033.5,-384 1033.5,-390 1027.5,-396 1021.5,-396"/>
-<text text-anchor="middle" x="997" y="-375.5" font-family="sans" font-size="10.00" fill="#000000">purge_dups</text>
-</g>
-<!-- 5&#45;&gt;12 -->
-<g id="edge26" class="edge">
-<title>5&#45;&gt;12</title>
-<path fill="none" stroke="#c0c0c0" stroke-width="2" d="M640.7239,-504.0535C720.3611,-475.8772 873.5598,-421.6742 950.5018,-394.4515"/>
-<polygon fill="#c0c0c0" stroke="#c0c0c0" stroke-width="2" points="952.1308,-397.5878 960.3907,-390.9527 949.7959,-390.9886 952.1308,-397.5878"/>
-</g>
-<!-- 13 -->
-<g id="node14" class="node">
-<title>13</title>
-<path fill="none" stroke="#5682d8" stroke-width="2" d="M1040.5,-468C1040.5,-468 953.5,-468 953.5,-468 947.5,-468 941.5,-462 941.5,-456 941.5,-456 941.5,-444 941.5,-444 941.5,-438 947.5,-432 953.5,-432 953.5,-432 1040.5,-432 1040.5,-432 1046.5,-432 1052.5,-438 1052.5,-444 1052.5,-444 1052.5,-456 1052.5,-456 1052.5,-462 1046.5,-468 1040.5,-468"/>
-<text text-anchor="middle" x="997" y="-447.5" font-family="sans" font-size="10.00" fill="#000000">purge_dups_cutoffs</text>
-</g>
-<!-- 5&#45;&gt;13 -->
-<g id="edge27" class="edge">
-<title>5&#45;&gt;13</title>
-<path fill="none" stroke="#c0c0c0" stroke-width="2" d="M640.7239,-513.0267C714.1887,-500.0305 850.2539,-475.96 931.3067,-461.6214"/>
-<polygon fill="#c0c0c0" stroke="#c0c0c0" stroke-width="2" points="932.1581,-465.0252 941.3955,-459.8367 930.9387,-458.1323 932.1581,-465.0252"/>
-</g>
-<!-- 23 -->
-<g id="node24" class="node">
-<title>23</title>
-<path fill="none" stroke="#afd856" stroke-width="2" d="M577,-468C577,-468 547,-468 547,-468 541,-468 535,-462 535,-456 535,-456 535,-444 535,-444 535,-438 541,-432 547,-432 547,-432 577,-432 577,-432 583,-432 589,-438 589,-444 589,-444 589,-456 589,-456 589,-462 583,-468 577,-468"/>
-<text text-anchor="middle" x="562" y="-447.5" font-family="sans" font-size="10.00" fill="#000000">cp_hap</text>
-</g>
-<!-- 5&#45;&gt;23 -->
-<g id="edge49" class="edge">
-<title>5&#45;&gt;23</title>
-<path fill="none" stroke="#c0c0c0" stroke-width="2" d="M582.9344,-503.8314C579.874,-495.9617 576.2221,-486.5712 572.8318,-477.8533"/>
-<polygon fill="#c0c0c0" stroke="#c0c0c0" stroke-width="2" points="576.0473,-476.4647 569.1607,-468.4133 569.5232,-479.0019 576.0473,-476.4647"/>
-</g>
-<!-- 6 -->
-<g id="node7" class="node">
-<title>6</title>
-<path fill="none" stroke="#d8bc56" stroke-width="2" d="M605,-612C605,-612 575,-612 575,-612 569,-612 563,-606 563,-600 563,-600 563,-588 563,-588 563,-582 569,-576 575,-576 575,-576 605,-576 605,-576 611,-576 617,-582 617,-588 617,-588 617,-600 617,-600 617,-606 611,-612 605,-612"/>
-<text text-anchor="middle" x="590" y="-591.5" font-family="sans" font-size="10.00" fill="#000000">hifiasm</text>
-</g>
-<!-- 6&#45;&gt;5 -->
-<g id="edge18" class="edge">
-<title>6&#45;&gt;5</title>
-<path fill="none" stroke="#c0c0c0" stroke-width="2" d="M590,-575.8314C590,-568.131 590,-558.9743 590,-550.4166"/>
-<polygon fill="#c0c0c0" stroke="#c0c0c0" stroke-width="2" points="593.5001,-550.4132 590,-540.4133 586.5001,-550.4133 593.5001,-550.4132"/>
-</g>
-<!-- 7 -->
-<g id="node8" class="node">
-<title>7</title>
-<path fill="none" stroke="#bed856" stroke-width="2" d="M376,-252C376,-252 346,-252 346,-252 340,-252 334,-246 334,-240 334,-240 334,-228 334,-228 334,-222 340,-216 346,-216 346,-216 376,-216 376,-216 382,-216 388,-222 388,-228 388,-228 388,-240 388,-240 388,-246 382,-252 376,-252"/>
-<text text-anchor="middle" x="361" y="-231.5" font-family="sans" font-size="10.00" fill="#000000">busco</text>
-</g>
-<!-- 7&#45;&gt;0 -->
-<g id="edge4" class="edge">
-<title>7&#45;&gt;0</title>
-<path fill="none" stroke="#c0c0c0" stroke-width="2" d="M368.6017,-215.9044C383.975,-181.7088 422.1989,-107.824 479,-72 520.5285,-45.8083 662.5351,-28.5579 731.208,-21.5482"/>
-<polygon fill="#c0c0c0" stroke="#c0c0c0" stroke-width="2" points="731.9713,-24.9894 741.5733,-20.5113 731.2745,-18.0241 731.9713,-24.9894"/>
-</g>
-<!-- 7&#45;&gt;20 -->
-<g id="edge37" class="edge">
-<title>7&#45;&gt;20</title>
-<path fill="none" stroke="#c0c0c0" stroke-width="2" d="M388.131,-219.7716C391.4033,-218.3645 394.741,-217.0656 398,-216 489.0354,-186.2333 601.4018,-171.4869 658.6479,-165.4566"/>
-<polygon fill="#c0c0c0" stroke="#c0c0c0" stroke-width="2" points="659.0862,-168.9301 668.678,-164.4304 658.3736,-161.9665 659.0862,-168.9301"/>
-</g>
-<!-- 8&#45;&gt;7 -->
-<g id="edge19" class="edge">
-<title>8&#45;&gt;7</title>
-<path fill="none" stroke="#c0c0c0" stroke-width="2" d="M428.2892,-287.8314C417.204,-278.9632 403.7046,-268.1637 391.7148,-258.5718"/>
-<polygon fill="#c0c0c0" stroke="#c0c0c0" stroke-width="2" points="393.6987,-255.6768 383.7035,-252.1628 389.3258,-261.1429 393.6987,-255.6768"/>
-</g>
-<!-- 10 -->
-<g id="node11" class="node">
-<title>10</title>
-<path fill="none" stroke="#c6d856" stroke-width="2" d="M483.5,-252C483.5,-252 418.5,-252 418.5,-252 412.5,-252 406.5,-246 406.5,-240 406.5,-240 406.5,-228 406.5,-228 406.5,-222 412.5,-216 418.5,-216 418.5,-216 483.5,-216 483.5,-216 489.5,-216 495.5,-222 495.5,-228 495.5,-228 495.5,-240 495.5,-240 495.5,-246 489.5,-252 483.5,-252"/>
-<text text-anchor="middle" x="451" y="-231.5" font-family="sans" font-size="10.00" fill="#000000">find_telomeres</text>
-</g>
-<!-- 8&#45;&gt;10 -->
-<g id="edge23" class="edge">
-<title>8&#45;&gt;10</title>
-<path fill="none" stroke="#c0c0c0" stroke-width="2" d="M451,-287.8314C451,-280.131 451,-270.9743 451,-262.4166"/>
-<polygon fill="#c0c0c0" stroke="#c0c0c0" stroke-width="2" points="454.5001,-262.4132 451,-252.4133 447.5001,-262.4133 454.5001,-262.4132"/>
-</g>
-<!-- 9&#45;&gt;0 -->
-<g id="edge14" class="edge">
-<title>9&#45;&gt;0</title>
-<path fill="none" stroke="#c0c0c0" stroke-width="2" d="M614.6552,-215.7578C611.5127,-183.239 609.7096,-114.6717 643,-72 664.5778,-44.3415 703.0788,-30.6313 731.8617,-23.9764"/>
-<polygon fill="#c0c0c0" stroke="#c0c0c0" stroke-width="2" points="732.9207,-27.3297 741.9769,-21.8311 731.4683,-20.4821 732.9207,-27.3297"/>
-</g>
-<!-- 9&#45;&gt;20 -->
-<g id="edge45" class="edge">
-<title>9&#45;&gt;20</title>
-<path fill="none" stroke="#c0c0c0" stroke-width="2" d="M636.935,-215.8314C646.4783,-207.1337 658.0599,-196.5783 668.4306,-187.1265"/>
-<polygon fill="#c0c0c0" stroke="#c0c0c0" stroke-width="2" points="671.038,-189.4857 676.0713,-180.1628 666.3228,-184.3121 671.038,-189.4857"/>
-</g>
-<!-- 10&#45;&gt;0 -->
-<g id="edge1" class="edge">
-<title>10&#45;&gt;0</title>
-<path fill="none" stroke="#c0c0c0" stroke-width="2" d="M452.0719,-215.7403C455.0875,-181.9324 466.55,-109.6148 510,-72 542.8135,-43.5933 668.3208,-27.6995 731.7871,-21.3245"/>
-<polygon fill="#c0c0c0" stroke="#c0c0c0" stroke-width="2" points="732.1709,-24.8038 741.7831,-20.3478 731.4901,-17.8369 732.1709,-24.8038"/>
-</g>
-<!-- 10&#45;&gt;20 -->
-<g id="edge35" class="edge">
-<title>10&#45;&gt;20</title>
-<path fill="none" stroke="#c0c0c0" stroke-width="2" d="M495.6519,-220.8778C542.5168,-207.1053 615.3375,-185.7049 659.0751,-172.8514"/>
-<polygon fill="#c0c0c0" stroke="#c0c0c0" stroke-width="2" points="660.1331,-176.1886 668.7405,-170.011 658.1594,-169.4726 660.1331,-176.1886"/>
-</g>
-<!-- 11 -->
-<g id="node12" class="node">
-<title>11</title>
-<path fill="none" stroke="#80d856" stroke-width="2" d="M1024,-252C1024,-252 970,-252 970,-252 964,-252 958,-246 958,-240 958,-240 958,-228 958,-228 958,-222 964,-216 970,-216 970,-216 1024,-216 1024,-216 1030,-216 1036,-222 1036,-228 1036,-228 1036,-240 1036,-240 1036,-246 1030,-252 1024,-252"/>
-<text text-anchor="middle" x="997" y="-231.5" font-family="sans" font-size="10.00" fill="#000000">purge_busco</text>
-</g>
-<!-- 11&#45;&gt;0 -->
-<g id="edge7" class="edge">
-<title>11&#45;&gt;0</title>
-<path fill="none" stroke="#c0c0c0" stroke-width="2" d="M978.5547,-215.8096C947.8417,-185.6125 884.1418,-123.3716 829,-72 818.6228,-62.3323 807.0734,-51.8691 796.9004,-42.7503"/>
-<polygon fill="#c0c0c0" stroke="#c0c0c0" stroke-width="2" points="799.2199,-40.1292 789.4315,-36.0746 794.555,-45.3483 799.2199,-40.1292"/>
-</g>
-<!-- 11&#45;&gt;20 -->
-<g id="edge39" class="edge">
-<title>11&#45;&gt;20</title>
-<path fill="none" stroke="#c0c0c0" stroke-width="2" d="M957.8675,-218.7845C954.8839,-217.7907 951.9043,-216.8494 949,-216 873.5413,-193.9312 782.9846,-176.7947 733.259,-168.1638"/>
-<polygon fill="#c0c0c0" stroke="#c0c0c0" stroke-width="2" points="733.736,-164.6945 723.288,-166.4524 732.5518,-171.5936 733.736,-164.6945"/>
-</g>
-<!-- 12&#45;&gt;11 -->
-<g id="edge24" class="edge">
-<title>12&#45;&gt;11</title>
-<path fill="none" stroke="#c0c0c0" stroke-width="2" d="M997,-359.7623C997,-335.201 997,-291.2474 997,-262.3541"/>
-<polygon fill="#c0c0c0" stroke="#c0c0c0" stroke-width="2" points="1000.5001,-262.0896 997,-252.0896 993.5001,-262.0897 1000.5001,-262.0896"/>
-</g>
-<!-- 14 -->
-<g id="node15" class="node">
-<title>14</title>
-<path fill="none" stroke="#d8ac56" stroke-width="2" d="M1155.5,-252C1155.5,-252 1066.5,-252 1066.5,-252 1060.5,-252 1054.5,-246 1054.5,-240 1054.5,-240 1054.5,-228 1054.5,-228 1054.5,-222 1060.5,-216 1066.5,-216 1066.5,-216 1155.5,-216 1155.5,-216 1161.5,-216 1167.5,-222 1167.5,-228 1167.5,-228 1167.5,-240 1167.5,-240 1167.5,-246 1161.5,-252 1155.5,-252"/>
-<text text-anchor="middle" x="1111" y="-231.5" font-family="sans" font-size="10.00" fill="#000000">purge_genometools</text>
-</g>
-<!-- 12&#45;&gt;14 -->
-<g id="edge28" class="edge">
-<title>12&#45;&gt;14</title>
-<path fill="none" stroke="#c0c0c0" stroke-width="2" d="M1011.4382,-359.7623C1031.3965,-334.5518 1067.5293,-288.9103 1090.3413,-260.0952"/>
-<polygon fill="#c0c0c0" stroke="#c0c0c0" stroke-width="2" points="1093.2162,-262.1026 1096.6791,-252.0896 1087.7278,-257.7576 1093.2162,-262.1026"/>
-</g>
-<!-- 12&#45;&gt;15 -->
-<g id="edge30" class="edge">
-<title>12&#45;&gt;15</title>
-<path fill="none" stroke="#c0c0c0" stroke-width="2" d="M985.6014,-359.7623C969.9802,-334.7682 941.8077,-289.6924 923.7755,-260.8409"/>
-<polygon fill="#c0c0c0" stroke="#c0c0c0" stroke-width="2" points="926.574,-258.7146 918.306,-252.0896 920.638,-262.4246 926.574,-258.7146"/>
-</g>
-<!-- 16 -->
-<g id="node17" class="node">
-<title>16</title>
-<path fill="none" stroke="#d88d56" stroke-width="2" d="M1296,-252C1296,-252 1198,-252 1198,-252 1192,-252 1186,-246 1186,-240 1186,-240 1186,-228 1186,-228 1186,-222 1192,-216 1198,-216 1198,-216 1296,-216 1296,-216 1302,-216 1308,-222 1308,-228 1308,-228 1308,-240 1308,-240 1308,-246 1302,-252 1296,-252"/>
-<text text-anchor="middle" x="1247" y="-231.5" font-family="sans" font-size="10.00" fill="#000000">purge_find_telomeres</text>
-</g>
-<!-- 12&#45;&gt;16 -->
-<g id="edge31" class="edge">
-<title>12&#45;&gt;16</title>
-<path fill="none" stroke="#c0c0c0" stroke-width="2" d="M1028.4095,-359.9081C1073.5952,-333.8812 1156.9584,-285.8639 1206.652,-257.2404"/>
-<polygon fill="#c0c0c0" stroke="#c0c0c0" stroke-width="2" points="1208.6039,-260.1553 1215.5223,-252.1311 1205.11,-254.0896 1208.6039,-260.1553"/>
-</g>
-<!-- 18 -->
-<g id="node19" class="node">
-<title>18</title>
-<path fill="none" stroke="#8fd856" stroke-width="2" d="M1279.5,-180C1279.5,-180 1216.5,-180 1216.5,-180 1210.5,-180 1204.5,-174 1204.5,-168 1204.5,-168 1204.5,-156 1204.5,-156 1204.5,-150 1210.5,-144 1216.5,-144 1216.5,-144 1279.5,-144 1279.5,-144 1285.5,-144 1291.5,-150 1291.5,-156 1291.5,-156 1291.5,-168 1291.5,-168 1291.5,-174 1285.5,-180 1279.5,-180"/>
-<text text-anchor="middle" x="1248" y="-159.5" font-family="sans" font-size="10.00" fill="#000000">link_final_asm</text>
-</g>
-<!-- 12&#45;&gt;18 -->
-<g id="edge33" class="edge">
-<title>12&#45;&gt;18</title>
-<path fill="none" stroke="#c0c0c0" stroke-width="2" d="M1033.7241,-369.0603C1110.5371,-349.5387 1283.8146,-300.9154 1317,-252 1325.9827,-238.7595 1324.008,-230.3836 1317,-216 1311.2696,-204.2386 1301.6013,-194.2849 1291.2879,-186.2518"/>
-<polygon fill="#c0c0c0" stroke="#c0c0c0" stroke-width="2" points="1293.0974,-183.2412 1282.9415,-180.2233 1288.9987,-188.9159 1293.0974,-183.2412"/>
-</g>
-<!-- 26 -->
-<g id="node27" class="node">
-<title>26</title>
-<path fill="none" stroke="#56d882" stroke-width="2" d="M916.5,-324C916.5,-324 879.5,-324 879.5,-324 873.5,-324 867.5,-318 867.5,-312 867.5,-312 867.5,-300 867.5,-300 867.5,-294 873.5,-288 879.5,-288 879.5,-288 916.5,-288 916.5,-288 922.5,-288 928.5,-294 928.5,-300 928.5,-300 928.5,-312 928.5,-312 928.5,-318 922.5,-324 916.5,-324"/>
-<text text-anchor="middle" x="898" y="-303.5" font-family="sans" font-size="10.00" fill="#000000">purge_cp</text>
-</g>
-<!-- 12&#45;&gt;26 -->
-<g id="edge53" class="edge">
-<title>12&#45;&gt;26</title>
-<path fill="none" stroke="#c0c0c0" stroke-width="2" d="M972.0181,-359.8314C959.7072,-350.8779 944.6893,-339.9558 931.4063,-330.2955"/>
-<polygon fill="#c0c0c0" stroke="#c0c0c0" stroke-width="2" points="933.1199,-327.214 922.9739,-324.1628 929.0027,-332.8752 933.1199,-327.214"/>
-</g>
-<!-- 13&#45;&gt;12 -->
-<g id="edge25" class="edge">
-<title>13&#45;&gt;12</title>
-<path fill="none" stroke="#c0c0c0" stroke-width="2" d="M997,-431.8314C997,-424.131 997,-414.9743 997,-406.4166"/>
-<polygon fill="#c0c0c0" stroke="#c0c0c0" stroke-width="2" points="1000.5001,-406.4132 997,-396.4133 993.5001,-406.4133 1000.5001,-406.4132"/>
-</g>
-<!-- 17 -->
-<g id="node18" class="node">
-<title>17</title>
-<path fill="none" stroke="#56c9d8" stroke-width="2" d="M1299,-108C1299,-108 1247,-108 1247,-108 1241,-108 1235,-102 1235,-96 1235,-96 1235,-84 1235,-84 1235,-78 1241,-72 1247,-72 1247,-72 1299,-72 1299,-72 1305,-72 1311,-78 1311,-84 1311,-84 1311,-96 1311,-96 1311,-102 1305,-108 1299,-108"/>
-<text text-anchor="middle" x="1273" y="-87.5" font-family="sans" font-size="10.00" fill="#000000">cutoffs_eval</text>
-</g>
-<!-- 13&#45;&gt;17 -->
-<g id="edge32" class="edge">
-<title>13&#45;&gt;17</title>
-<path fill="none" stroke="#c0c0c0" stroke-width="2" d="M1052.6276,-440.7111C1157.4965,-421.6152 1374,-373.9615 1374,-306 1374,-306 1374,-306 1374,-234 1374,-184.9451 1334.2284,-140.7517 1304.5999,-114.6353"/>
-<polygon fill="#c0c0c0" stroke="#c0c0c0" stroke-width="2" points="1306.809,-111.9191 1296.9356,-108.0763 1302.2576,-117.2375 1306.809,-111.9191"/>
-</g>
-<!-- 14&#45;&gt;0 -->
-<g id="edge12" class="edge">
-<title>14&#45;&gt;0</title>
-<path fill="none" stroke="#c0c0c0" stroke-width="2" d="M1083.8769,-215.8627C1038.2536,-185.4814 943.0854,-122.6518 861,-72 842.7811,-60.7578 822.2576,-48.6583 805.2255,-38.7678"/>
-<polygon fill="#c0c0c0" stroke="#c0c0c0" stroke-width="2" points="806.5672,-35.5004 796.1591,-33.5198 803.0604,-41.5587 806.5672,-35.5004"/>
-</g>
-<!-- 14&#45;&gt;20 -->
-<g id="edge43" class="edge">
-<title>14&#45;&gt;20</title>
-<path fill="none" stroke="#c0c0c0" stroke-width="2" d="M1054.2177,-218.0772C1051.1077,-217.3417 1048.0202,-216.6434 1045,-216 932.5289,-192.0416 797.5267,-174.2035 733.3801,-166.378"/>
-<polygon fill="#c0c0c0" stroke="#c0c0c0" stroke-width="2" points="733.6896,-162.89 723.3419,-165.1648 732.8496,-169.8394 733.6896,-162.89"/>
-</g>
-<!-- 15&#45;&gt;0 -->
-<g id="edge2" class="edge">
-<title>15&#45;&gt;0</title>
-<path fill="none" stroke="#c0c0c0" stroke-width="2" d="M895.4716,-215.9555C871.1694,-177.9173 814.5803,-89.3431 786.0664,-44.7127"/>
-<polygon fill="#c0c0c0" stroke="#c0c0c0" stroke-width="2" points="788.9261,-42.6877 780.5927,-36.1451 783.0272,-46.4564 788.9261,-42.6877"/>
-</g>
-<!-- 15&#45;&gt;20 -->
-<g id="edge36" class="edge">
-<title>15&#45;&gt;20</title>
-<path fill="none" stroke="#c0c0c0" stroke-width="2" d="M874.455,-219.3397C871.6109,-218.1719 868.764,-217.0424 866,-216 820.7722,-198.9437 767.4965,-182.6111 732.8015,-172.4609"/>
-<polygon fill="#c0c0c0" stroke="#c0c0c0" stroke-width="2" points="733.6753,-169.07 723.0959,-169.6409 731.7222,-175.7921 733.6753,-169.07"/>
-</g>
-<!-- 16&#45;&gt;0 -->
-<g id="edge8" class="edge">
-<title>16&#45;&gt;0</title>
-<path fill="none" stroke="#c0c0c0" stroke-width="2" d="M1210.28,-215.968C1147.5514,-185.3424 1015.519,-121.6554 902,-72 869.6675,-57.8572 832.4089,-42.8375 805.5673,-32.2387"/>
-<polygon fill="#c0c0c0" stroke="#c0c0c0" stroke-width="2" points="806.7651,-28.9488 796.178,-28.5436 804.2017,-35.4626 806.7651,-28.9488"/>
-</g>
-<!-- 16&#45;&gt;20 -->
-<g id="edge40" class="edge">
-<title>16&#45;&gt;20</title>
-<path fill="none" stroke="#c0c0c0" stroke-width="2" d="M1185.6478,-217.7061C1182.7321,-217.0948 1179.8385,-216.5216 1177,-216 1013.073,-185.8748 814.4224,-170.015 733.2586,-164.3992"/>
-<polygon fill="#c0c0c0" stroke="#c0c0c0" stroke-width="2" points="733.2652,-160.8917 723.0506,-163.7044 732.7898,-167.8755 733.2652,-160.8917"/>
-</g>
-<!-- 17&#45;&gt;0 -->
-<g id="edge3" class="edge">
-<title>17&#45;&gt;0</title>
-<path fill="none" stroke="#c0c0c0" stroke-width="2" d="M1234.7596,-80.8678C1221.0144,-77.7812 1205.3608,-74.4899 1191,-72 1049.844,-47.5257 880.336,-29.1392 806.4687,-21.6649"/>
-<polygon fill="#c0c0c0" stroke="#c0c0c0" stroke-width="2" points="806.5787,-18.1585 796.279,-20.6419 805.8793,-25.1234 806.5787,-18.1585"/>
-</g>
-<!-- 18&#45;&gt;0 -->
-<g id="edge5" class="edge">
-<title>18&#45;&gt;0</title>
-<path fill="none" stroke="#c0c0c0" stroke-width="2" d="M1221.3691,-143.9882C1189.5927,-123.3636 1134.1893,-90.0996 1082,-72 985.9157,-38.6774 865.8257,-25.3411 806.1861,-20.4886"/>
-<polygon fill="#c0c0c0" stroke="#c0c0c0" stroke-width="2" points="806.3573,-16.9914 796.1156,-19.7038 805.8134,-23.9703 806.3573,-16.9914"/>
-</g>
-<!-- 19 -->
-<g id="node20" class="node">
-<title>19</title>
-<path fill="none" stroke="#56d8a2" stroke-width="2" d="M728.5,-108C728.5,-108 663.5,-108 663.5,-108 657.5,-108 651.5,-102 651.5,-96 651.5,-96 651.5,-84 651.5,-84 651.5,-78 657.5,-72 663.5,-72 663.5,-72 728.5,-72 728.5,-72 734.5,-72 740.5,-78 740.5,-84 740.5,-84 740.5,-96 740.5,-96 740.5,-102 734.5,-108 728.5,-108"/>
-<text text-anchor="middle" x="696" y="-87.5" font-family="sans" font-size="10.00" fill="#000000">rename_report</text>
-</g>
-<!-- 19&#45;&gt;0 -->
-<g id="edge10" class="edge">
-<title>19&#45;&gt;0</title>
-<path fill="none" stroke="#c0c0c0" stroke-width="2" d="M714.421,-71.8314C723.1529,-63.219 733.7317,-52.7851 743.2423,-43.4048"/>
-<polygon fill="#c0c0c0" stroke="#c0c0c0" stroke-width="2" points="745.923,-45.6769 750.5849,-36.1628 741.0075,-40.6931 745.923,-45.6769"/>
-</g>
-<!-- 20&#45;&gt;19 -->
-<g id="edge34" class="edge">
-<title>20&#45;&gt;19</title>
-<path fill="none" stroke="#c0c0c0" stroke-width="2" d="M696,-143.8314C696,-136.131 696,-126.9743 696,-118.4166"/>
-<polygon fill="#c0c0c0" stroke="#c0c0c0" stroke-width="2" points="699.5001,-118.4132 696,-108.4133 692.5001,-118.4133 699.5001,-118.4132"/>
-</g>
-<!-- 21 -->
-<g id="node22" class="node">
-<title>21</title>
-<path fill="none" stroke="#d88556" stroke-width="2" d="M578,-324C578,-324 544,-324 544,-324 538,-324 532,-318 532,-312 532,-312 532,-300 532,-300 532,-294 538,-288 544,-288 544,-288 578,-288 578,-288 584,-288 590,-294 590,-300 590,-300 590,-312 590,-312 590,-318 584,-324 578,-324"/>
-<text text-anchor="middle" x="561" y="-303.5" font-family="sans" font-size="10.00" fill="#000000">merqury</text>
-</g>
-<!-- 21&#45;&gt;0 -->
-<g id="edge6" class="edge">
-<title>21&#45;&gt;0</title>
-<path fill="none" stroke="#c0c0c0" stroke-width="2" d="M560.5729,-287.5068C560.3533,-277.08 560.1094,-263.814 560,-252 559.2193,-167.6762 558.984,-130.2079 620,-72 650.7188,-42.6949 698.5374,-29.0986 731.6826,-22.9208"/>
-<polygon fill="#c0c0c0" stroke="#c0c0c0" stroke-width="2" points="732.6263,-26.31 741.8881,-21.1652 731.4395,-19.4113 732.6263,-26.31"/>
-</g>
-<!-- 21&#45;&gt;20 -->
-<g id="edge38" class="edge">
-<title>21&#45;&gt;20</title>
-<path fill="none" stroke="#c0c0c0" stroke-width="2" d="M560.5269,-287.7828C560.9552,-268.0437 564.4916,-236.6148 581,-216 600.4744,-191.6814 633.4687,-177.7591 659.1348,-170.1642"/>
-<polygon fill="#c0c0c0" stroke="#c0c0c0" stroke-width="2" points="660.2107,-173.4982 668.9196,-167.4646 658.3489,-166.7503 660.2107,-173.4982"/>
-</g>
-<!-- 22 -->
-<g id="node23" class="node">
-<title>22</title>
-<path fill="none" stroke="#56a2d8" stroke-width="2" d="M751,-396C751,-396 721,-396 721,-396 715,-396 709,-390 709,-384 709,-384 709,-372 709,-372 709,-366 715,-360 721,-360 721,-360 751,-360 751,-360 757,-360 763,-366 763,-372 763,-372 763,-384 763,-384 763,-390 757,-396 751,-396"/>
-<text text-anchor="middle" x="736" y="-375.5" font-family="sans" font-size="10.00" fill="#000000">meryl</text>
-</g>
-<!-- 22&#45;&gt;21 -->
-<g id="edge48" class="edge">
-<title>22&#45;&gt;21</title>
-<path fill="none" stroke="#c0c0c0" stroke-width="2" d="M708.6563,-366.75C679.3459,-354.6909 632.5808,-335.4504 599.5445,-321.8583"/>
-<polygon fill="#c0c0c0" stroke="#c0c0c0" stroke-width="2" points="600.7721,-318.5788 590.1925,-318.0106 598.1087,-325.0523 600.7721,-318.5788"/>
-</g>
-<!-- 25 -->
-<g id="node26" class="node">
-<title>25</title>
-<path fill="none" stroke="#d89c56" stroke-width="2" d="M834.5,-324C834.5,-324 763.5,-324 763.5,-324 757.5,-324 751.5,-318 751.5,-312 751.5,-312 751.5,-300 751.5,-300 751.5,-294 757.5,-288 763.5,-288 763.5,-288 834.5,-288 834.5,-288 840.5,-288 846.5,-294 846.5,-300 846.5,-300 846.5,-312 846.5,-312 846.5,-318 840.5,-324 834.5,-324"/>
-<text text-anchor="middle" x="799" y="-303.5" font-family="sans" font-size="10.00" fill="#000000">purge_cp_meryl</text>
-</g>
-<!-- 22&#45;&gt;25 -->
-<g id="edge52" class="edge">
-<title>22&#45;&gt;25</title>
-<path fill="none" stroke="#c0c0c0" stroke-width="2" d="M751.8976,-359.8314C759.2278,-351.454 768.066,-341.3531 776.0969,-332.1749"/>
-<polygon fill="#c0c0c0" stroke="#c0c0c0" stroke-width="2" points="778.9373,-334.2438 782.8884,-324.4133 773.6693,-329.6343 778.9373,-334.2438"/>
-</g>
-<!-- 23&#45;&gt;21 -->
-<g id="edge47" class="edge">
-<title>23&#45;&gt;21</title>
-<path fill="none" stroke="#c0c0c0" stroke-width="2" d="M561.8733,-431.7623C561.7028,-407.201 561.3976,-363.2474 561.1969,-334.3541"/>
-<polygon fill="#c0c0c0" stroke="#c0c0c0" stroke-width="2" points="564.6951,-334.065 561.1256,-324.0896 557.6952,-334.1137 564.6951,-334.065"/>
-</g>
-<!-- 24 -->
-<g id="node25" class="node">
-<title>24</title>
-<path fill="none" stroke="#70d856" stroke-width="2" d="M844.5,-252C844.5,-252 777.5,-252 777.5,-252 771.5,-252 765.5,-246 765.5,-240 765.5,-240 765.5,-228 765.5,-228 765.5,-222 771.5,-216 777.5,-216 777.5,-216 844.5,-216 844.5,-216 850.5,-216 856.5,-222 856.5,-228 856.5,-228 856.5,-240 856.5,-240 856.5,-246 850.5,-252 844.5,-252"/>
-<text text-anchor="middle" x="811" y="-231.5" font-family="sans" font-size="10.00" fill="#000000">purge_merqury</text>
-</g>
-<!-- 24&#45;&gt;0 -->
-<g id="edge9" class="edge">
-<title>24&#45;&gt;0</title>
-<path fill="none" stroke="#c0c0c0" stroke-width="2" d="M808.3748,-215.927C803.9016,-185.8988 794.1919,-123.895 783,-72 781.1911,-63.6126 778.9576,-54.5632 776.8099,-46.3145"/>
-<polygon fill="#c0c0c0" stroke="#c0c0c0" stroke-width="2" points="780.1269,-45.1686 774.1738,-36.4044 773.3621,-46.9681 780.1269,-45.1686"/>
-</g>
-<!-- 24&#45;&gt;20 -->
-<g id="edge41" class="edge">
-<title>24&#45;&gt;20</title>
-<path fill="none" stroke="#c0c0c0" stroke-width="2" d="M781.9806,-215.8314C766.71,-206.2706 747.8539,-194.4651 731.6949,-184.3481"/>
-<polygon fill="#c0c0c0" stroke="#c0c0c0" stroke-width="2" points="733.3682,-181.2664 723.0351,-178.9263 729.6536,-187.1995 733.3682,-181.2664"/>
-</g>
-<!-- 25&#45;&gt;24 -->
-<g id="edge51" class="edge">
-<title>25&#45;&gt;24</title>
-<path fill="none" stroke="#c0c0c0" stroke-width="2" d="M802.0281,-287.8314C803.3115,-280.131 804.8376,-270.9743 806.2639,-262.4166"/>
-<polygon fill="#c0c0c0" stroke="#c0c0c0" stroke-width="2" points="809.7394,-262.8526 807.9311,-252.4133 802.8347,-261.7018 809.7394,-262.8526"/>
-</g>
-<!-- 26&#45;&gt;24 -->
-<g id="edge50" class="edge">
-<title>26&#45;&gt;24</title>
-<path fill="none" stroke="#c0c0c0" stroke-width="2" d="M876.0462,-287.8314C865.3305,-278.9632 852.2811,-268.1637 840.691,-258.5718"/>
-<polygon fill="#c0c0c0" stroke="#c0c0c0" stroke-width="2" points="842.8822,-255.8422 832.9467,-252.1628 838.4192,-261.2349 842.8822,-255.8422"/>
-</g>
-</g>
-</svg>
diff --git a/workflow/documentation.md b/workflow/documentation.md
index a698a4c..52fac70 100644
--- a/workflow/documentation.md
+++ b/workflow/documentation.md
@@ -1,26 +1,31 @@
+# [asm4pg](https://forgemia.inra.fr/asm4pg/GenomAsm4pg)
+
 Asm4pg is an automatic and reproducible genome assembly workflow for pangenomic applications using PacBio HiFi data.
 
 [TOC]
 
-# Asm4pg Requirements
+![workflow DAG](doc/fig/rule_dag.svg)
+
+## Asm4pg Requirements
 - snakemake >= 6.5.1
 - singularity
 
 The workflow does not work on HPC systems that do not allow a job to run other jobs.
 
-# Tutorials
+## Tutorials
 The three assembly modes from hifiasm are available.
 - [Quick start (default mode)](Quick-start)
 - [Hi-C mode](doc/Assembly-Mode/Hi-C-tutorial)
 - [Trio mode](doc/Assembly-Mode/Trio-tutorial)
 
-# Outputs
+## Outputs
 [Workflow outputs](doc/Outputs)
 
-# Optional Data Preparation
+## Optional Data Preparation
 If your [data is in a tarball](doc/Tar-data-preparation)
-# Known errors
+
+## Known errors
 You may run into [these errors](doc/Known-errors)
 
-# Programs
-[Programs used in the workflow](doc/Programs)
+## Software
+[Software used in the workflow](doc/Programs)
-- 
GitLab
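Editorial note: HonKit derives the Pages navigation from a `SUMMARY.md` table of contents at the repository root (the file patched below). A minimal sketch of such a TOC, with entries matching the documentation pages above — the exact nesting is illustrative:

```markdown
# Summary

* [asm4pg](workflow/documentation.md)
    * [Quick start (default mode)](workflow/doc/Quick-start.md)
    * [Hi-C mode](workflow/doc/Assembly-Mode/Hi-C-tutorial.md)
    * [Trio mode](workflow/doc/Assembly-Mode/Trio-tutorial.md)
    * [Workflow outputs](workflow/doc/Outputs.md)
```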


From 7712af5c738b6604a56895b5152cd69175605ee2 Mon Sep 17 00:00:00 2001
From: Joseph Tran <joseph.tran@inrae.fr>
Date: Mon, 16 Jan 2023 15:44:40 +0100
Subject: [PATCH 5/7] update pages links

---
 SUMMARY.md                                  |  1 +
 workflow/doc/Assembly-Mode/Hi-C-tutorial.md | 16 +++++++++-------
 workflow/doc/Assembly-Mode/Trio-tutorial.md | 16 +++++++++-------
 workflow/doc/Going-further.md               | 16 +++++++++-------
 workflow/doc/Known-errors.md                |  8 +++++---
 workflow/doc/Outputs.md                     |  6 ++++--
 workflow/doc/Quick-start.md                 | 20 +++++++++++---------
 workflow/doc/Tar-data-preparation.md        | 10 ++++++----
 workflow/documentation.md                   | 16 ++++++++--------
 9 files changed, 62 insertions(+), 47 deletions(-)

diff --git a/SUMMARY.md b/SUMMARY.md
index 7023558..5972000 100644
--- a/SUMMARY.md
+++ b/SUMMARY.md
@@ -11,6 +11,7 @@
         * [Workflow output](workflow/doc/Outputs.md)
     * [Optional data preparation](workflow/documentation.md#optional-data-preparation)
         * [if your data is in a tarball archive](workflow/doc/Tar-data-preparation.md)
+    * [Going further](workflow/doc/Going-further.md)
     * [Troubleshooting](workflow/documentation.md#known-errors)
         * [known errors](workflow/doc/Known-errors.md)
     * [Software Dependencies](workflow/documentation.md#programs)
diff --git a/workflow/doc/Assembly-Mode/Hi-C-tutorial.md b/workflow/doc/Assembly-Mode/Hi-C-tutorial.md
index 154edf8..95ac825 100644
--- a/workflow/doc/Assembly-Mode/Hi-C-tutorial.md
+++ b/workflow/doc/Assembly-Mode/Hi-C-tutorial.md
@@ -1,8 +1,10 @@
-Please look at [quick start](../Quick-start) first, some of the steps are omitted here.
+# Hi-C mode tutorial
+
+Please look at [quick start](../Quick-start.md) first, some of the steps are omitted here.
 
 This tutorial shows how to use the workflow with hi-c assembly mode which takes PacBio Hifi data and Hi-C data as input.
 
-# 1. Config file
+## 1. Config file
 **TO-DO : add a toy dataset fasta and hi-c.**
 ```bash
 cd GenomAsm4pg/.config
@@ -26,19 +28,19 @@ toy_dataset_hi-c:
   r2: ./GenomAsm4pg/tutorial_data/hi-c/data_r1.fasta
 ```
 
-# 2. Dry run
+## 2. Dry run
 To check the config, first do a dry run of the workflow.
 
 ```bash
 sbatch job.sh dry
 ```
-# 3. Run 
+## 3. Run 
 If the dry run is successful, you can run the workflow.
 
 ```bash
 sbatch job.sh
 ```
 
-# Other assembly modes
-If you want to use parental data, follow the [Trio assembly mode tutorial](../Assembly-Mode/Trio-tutorial).
-To go further with the workflow use go [here](../Going-further).
+## Other assembly modes
+If you want to use parental data, follow the [Trio assembly mode tutorial](Trio-tutorial.md).
+To go further with the workflow, go [here](../Going-further.md).
diff --git a/workflow/doc/Assembly-Mode/Trio-tutorial.md b/workflow/doc/Assembly-Mode/Trio-tutorial.md
index 3364731..027f4bf 100644
--- a/workflow/doc/Assembly-Mode/Trio-tutorial.md
+++ b/workflow/doc/Assembly-Mode/Trio-tutorial.md
@@ -1,8 +1,10 @@
-Please look at [quick start](../../Quick-start) first, some of the steps are omitted here.
+# Trio mode tutorial
+
+Please look at [quick start](../Quick-start.md) first, some of the steps are omitted here.
 
 This tutorial shows how to use the workflow with trio assembly mode, which takes PacBio HiFi data and parental data as input.
 
-# 1. Config file
+## 1. Config file
 **TO-DO : add a toy dataset fasta and parental fasta.**
 ```bash
 cd GenomAsm4pg/.config
@@ -27,19 +29,19 @@ toy_dataset_trio:
   p2: ./GenomAsm4pg/tutorial_data/trio/data_p2.fasta
 ```
 
-# 2. Dry run
+## 2. Dry run
 To check the config, first do a dry run of the workflow.
 
 ```bash
 sbatch job.sh dry
 ```
-# 3. Run 
+## 3. Run 
 If the dry run is successful, you can run the workflow.
 
 ```bash
 sbatch job.sh
 ```
 
-# Other assembly modes
-If you want to use Hi-C data, follow the [Hi-C assembly mode tutorial](doc/Assembly-Mode/Hi-C-tutorial).
-To go further with the workflow use go [here](doc/Going-further).
+## Other assembly modes
+If you want to use Hi-C data, follow the [Hi-C assembly mode tutorial](Hi-C-tutorial.md).
+To go further with the workflow, go [here](../Going-further.md).
diff --git a/workflow/doc/Going-further.md b/workflow/doc/Going-further.md
index e5e6cd4..7d27767 100644
--- a/workflow/doc/Going-further.md
+++ b/workflow/doc/Going-further.md
@@ -1,9 +1,11 @@
+# Going further
+
 [TOC]
 
-# 1. Multiple datasets
+## 1. Multiple datasets
 You can run the workflow on multiple datasets at the same time.
 
-## 1.1. All datasets
+### 1.1. All datasets
 With `masterconfig.yaml` as follows, running the workflow will assemble each dataset in its specific assembly mode.
 You can add as many datasets as you want, each with different parameters. 
 
@@ -35,7 +37,7 @@ toy_dataset_trio:
   p1: ./GenomAsm4pg/tutorial_data/trio/data_p1.fasta
   p2: ./GenomAsm4pg/tutorial_data/trio/data_p2.fasta
 ```
-## 1.2. On chosen datasets
+### 1.2. On chosen datasets
 You can remove dataset from IDS to assemble only chosen genomes:
 ```yaml
 IDS: ["toy_dataset", "toy_dataset_trio"]
@@ -67,7 +69,7 @@ toy_dataset_trio:
 ```
 Running the workflow with this config will assemble only `toy_dataset` and `toy_dataset_trio`.
 
-# 2. Different run names
+## 2. Different run names
 If you want to try different parameters on the same dataset, changing the run name will create a new directory and keep the previous data.
 
 In the [Hi-C tutorial](), we used the following config.
@@ -97,7 +99,7 @@ toy_dataset_hi-c:
 ```
 You will end up with 2 sub-directories for toy_dataset_hi-c (`hi-c_tutorial` and `default_comparaison`) and keep the data from the previous run in Hi-C mode.
 
-# 3. The same dataset with different parameters at once
+## 3. The same dataset with different parameters at once
 If you want to do the previous example in one run, you will have to create a symbolic link to the fasta with a different filename.
 
 YAML files do not allow multiple uses of the same key. The following config does not work.
@@ -122,7 +124,7 @@ toy_dataset_hi-c:
 
 **TO COMPLETE**
 
-# 4. Optional fastq and bam files
+## 4. Optional fastq and bam files
 If fastq and bam are available and you want to do raw QC with fastQC and longQC, add the `fastq` and/or `bam` key in your config. The fasta, fastq and bam filenames have to be the same. For example:
 
 ```yaml
@@ -136,4 +138,4 @@ toy_dataset:
   ploidy: 2
   busco_lineage: eudicots_odb10
   mode: default
-```
\ No newline at end of file
+```
diff --git a/workflow/doc/Known-errors.md b/workflow/doc/Known-errors.md
index ef83716..5cc2cc6 100644
--- a/workflow/doc/Known-errors.md
+++ b/workflow/doc/Known-errors.md
@@ -1,7 +1,9 @@
+# Troubleshooting
+
 [TOC]
 
-# One of the BUSCO rules failed
+## One of the BUSCO rules failed
 The first time you run the workflow, the BUSCO lineage might be downloaded multiple times. This can create a conflict between the jobs using BUSCO and may interrupt some of them. In that case, you only need to rerun the workflow once everything is done.
 
-# Snakemake locked directory
-When you try to rerun the workflow after cancelling a job, you may have to unlock the results directory. To do so, go in `.config/snakemake_profile/slurm` and uncomment line 14 of `config.yaml`. Run the workflow once to unlock the directory (it should only take a few seconds). Still in `config.yaml`, comment line 14. The workflow will be able to run and create outputs.
\ No newline at end of file
+## Snakemake locked directory
+When you try to rerun the workflow after cancelling a job, you may have to unlock the results directory. To do so, go in `.config/snakemake_profile/slurm` and uncomment line 14 of `config.yaml`. Run the workflow once to unlock the directory (it should only take a few seconds). Still in `config.yaml`, comment line 14. The workflow will be able to run and create outputs.
diff --git a/workflow/doc/Outputs.md b/workflow/doc/Outputs.md
index d27e5eb..e91b46a 100644
--- a/workflow/doc/Outputs.md
+++ b/workflow/doc/Outputs.md
@@ -1,6 +1,8 @@
+# Workflow output
+
 [TOC]
 
-# Directories
+## Directories
 There are three directories for the data produced by the workflow:
 - An automatic report is generated in the `RUN` directory.
 - `01_raw_data_QC` contains all quality control ran on the reads. FastQC and LongQC create HTML reports on fastq and bam files respectively, reads stats are given by Genometools, and predictions of genome size and heterozygosity are given by Genomescope (in directory `04_kmer`).
@@ -39,7 +41,7 @@ workflow_results
                     └── telomeres
 ```
 
-# Additional files
+## Additional files
 - Symbolic links to haplotype 1 and haplotype 2 assemblies after purge_dups
 - HTML report with the main results from each program
 - Runtime file with the total workflow runtime for the dataset
diff --git a/workflow/doc/Quick-start.md b/workflow/doc/Quick-start.md
index 29f6bc4..4db0da8 100644
--- a/workflow/doc/Quick-start.md
+++ b/workflow/doc/Quick-start.md
@@ -1,14 +1,16 @@
+# Quick start
+
 This tutorial shows how to use the workflow with default assembly mode which takes PacBio Hifi data as input.
 
 [TOC]
 
-# Clone repository
+## Clone repository
 ```bash
 cd .
 git clone https://forgemia.inra.fr/asm4pg/GenomAsm4pg.git
 ```
 
-# 1. Cluster profile setup
+## 1. Cluster profile setup
 ```bash
 cd GenomAsm4pg/.config/snakemake_profile
 ```
@@ -17,7 +19,7 @@ The current profile is made for SLURM. If you use it, change line 13 to your ema
 To run this workflow on another HPC, create another profile (https://github.com/Snakemake-Profiles) and add it in the `.config/snakemake_profile` directory. Change the `CLUSTER_CONFIG` and `PROFILE` variables in `job.sh` and `prejob.sh` scripts.
 
 
-# 2. Config file
+## 2. Config file
 **TO-DO : add a toy fasta.**
 ```bash
 cd ..
@@ -44,23 +46,23 @@ toy_dataset:
   mode: default
 ```
 
-# 3. Create slurm_logs directory
+## 3. Create slurm_logs directory
 ```bash
 cd ..
 mkdir slurm_logs
 ```
 SLURM logs for each rule will be in this directory; there are .out and .err files for the workflow (*snakemake.cortex**) and for each rule (*rulename.cortex**).
 
-# 4. Mail setup
+## 4. Mail setup
 Modify line 17 to your email address in `job.sh`.
 
-# 5. Dry run
+## 5. Dry run
 To check the config, first do a dry run of the workflow.
 
 ```bash
 sbatch job.sh dry
 ```
-# 6. Run 
+## 6. Run 
 If the dry run is successful, check that the `SNG_BIND` variable in `job.sh` is the same as `root` variable in `masterconfig.yaml`. 
 
 If Singularity is not in the HPC environment, add `module load singularity` under `module load snakemake/6.5.1`.
@@ -71,5 +73,5 @@ You can run the workflow.
 sbatch job.sh
 ```
 
-# Other assembly modes
-If you want to use additional Hi-C data or parental data, follow the [Hi-C assembly mode tutorial](Documentation/Assembly-Mode/Hi-C-tutorial) or the [Trio assembly mode tutorial](Assembly-Mode/Trio-tutorial.md). To go further with the workflow use go [here](Going-further.md).
+## Other assembly modes
+If you want to use additional Hi-C data or parental data, follow the [Hi-C assembly mode tutorial](Assembly-Mode/Hi-C-tutorial.md) or the [Trio assembly mode tutorial](Assembly-Mode/Trio-tutorial.md). To go further with the workflow, go [here](Going-further.md).
diff --git a/workflow/doc/Tar-data-preparation.md b/workflow/doc/Tar-data-preparation.md
index 6a92ec1..cc75afd 100644
--- a/workflow/doc/Tar-data-preparation.md
+++ b/workflow/doc/Tar-data-preparation.md
@@ -1,8 +1,10 @@
+# Optional: data preparation
+
 If your data is in a tarball, this companion workflow will extract the data and convert bam files to fastq and fasta if necessary.
 
 [TOC]
 
-# 1. Config file
+## 1. Config file
 ```bash
 cd GenomAsm4pg/.config
 ```
@@ -10,7 +12,7 @@ Modify the the `data` variable in file `.config/masterconfig.yaml` to be the pat
 This workflow can automatically determine the name of files in the specified `data` directory, or run only on given files:
 - `get_all_tar_filename: True` will uncompress all tar files. If you want to choose the files to uncompress, use `get_all_tar_filename: False` and give the filenames as a list in `tarIDS`
 
-# 2. Run 
+## 2. Run 
 Modify the `SNG_BIND` variable in `prejob.sh`; it has to be the same as the variable `root` in `.config/masterconfig.yaml`. Change line 17 to your email address.
 If Singularity is not in the HPC environment, add `module load singularity` under Module loading.
 
@@ -20,7 +22,7 @@ Then run
 sbatch prejob.sh
 ```
 
-# 3. Outputs
+## 3. Outputs
 This will create multiple directories to prepare the data for the workflow. You will end up with a `bam_files` directory containing all *bam* files, renamed as the tar filename if your data was named "ccs.bam", and a `fastx_files` directory containing all *fasta* and *fastq* files. The `extract` directory contains all other files that were in the tar ball.
 
 ```
@@ -29,4 +31,4 @@ workflow_results
 	├── bam_files
 	├── extract
 	└── fastx_files
-```
\ No newline at end of file
+```
diff --git a/workflow/documentation.md b/workflow/documentation.md
index 52fac70..8b20613 100644
--- a/workflow/documentation.md
+++ b/workflow/documentation.md
@@ -4,7 +4,7 @@ Asm4pg is an automatic and reproducible genome assembly workflow for pangenomic
 
 [TOC]
 
-![workflow DAG](doc/fig/rule_dag.svg)
+![workflow DAG](../workflow/doc/fig/rule_dag.svg)
 
 ## Asm4pg Requirements
 - snakemake >= 6.5.1
@@ -14,18 +14,18 @@ The workflow does not work with HPC that does not allow a job to run other jobs.
 
 ## Tutorials
 The three assembly modes from hifiasm are available.
-- [Quick start (default mode)](Quick-start)
-- [Hi-C mode](doc/Assembly-Mode/Hi-C-tutorial)
-- [Trio mode](doc/Assembly-Mode/Trio-tutorial)
+- [Quick start (default mode)](doc/Quick-start.md)
+- [Hi-C mode](doc/Assembly-Mode/Hi-C-tutorial.md)
+- [Trio mode](doc/Assembly-Mode/Trio-tutorial.md)
 
 ## Outputs
-[Workflow outputs](doc/Outputs)
+[Workflow outputs](doc/Outputs.md)
 
 ## Optional Data Preparation
-If your [data is in a tarball](doc/Tar-data-preparation)
+If your [data is in a tarball](doc/Tar-data-preparation.md)
 
 ## Known errors
-You may run into [these errors](doc/Known-errors)
+You may run into [these errors](doc/Known-errors.md)
 
 ## Software
-[Software used in the workflow](doc/Programs)
+[Software used in the workflow](doc/Programs.md)
-- 
GitLab
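Editorial note: the "Snakemake locked directory" fix above toggles a commented option in the SLURM profile's `config.yaml`. A sketch of the relevant fragment — the line position and surrounding keys are illustrative, not copied from the repository; Snakemake profiles map YAML keys to CLI flags, so `unlock: true` corresponds to running `snakemake --unlock`:

```yaml
# .config/snakemake_profile/slurm/config.yaml (fragment, illustrative)
jobs: 100
printshellcmds: true
# unlock: true   # uncomment, run the workflow once to unlock the results directory, then re-comment
```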


From 44b656c7230ad1b5ea71cb9728998d78edc319fb Mon Sep 17 00:00:00 2001
From: Joseph Tran <joseph.tran@inrae.fr>
Date: Mon, 16 Jan 2023 15:45:46 +0100
Subject: [PATCH 6/7] use node container lts version in cicd

---
 .gitlab-ci.yml | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
index 1997dc0..07cf074 100644
--- a/.gitlab-ci.yml
+++ b/.gitlab-ci.yml
@@ -1,5 +1,5 @@
-# requiring the environment of NodeJS 10
-image: node:10
+# requiring the environment of NodeJS LTS
+image: node:lts
 
 # add 'node_modules' to cache for speeding up builds
 cache:
-- 
GitLab
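Editorial note: for context, a complete HonKit Pages job built on the `node:lts` image might look like the sketch below. The job layout and the `public` output argument follow GitLab Pages conventions and the HonKit commands shown earlier; they are assumptions, not copied from the repository's actual `.gitlab-ci.yml`:

```yaml
image: node:lts

# cache node_modules to speed up repeated builds
cache:
  paths:
    - node_modules

pages:
  script:
    - npm install
    - npx honkit build . public   # render the book into public/, the directory GitLab Pages serves
  artifacts:
    paths:
      - public
```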


From 4a28daa4eb7809f8f23dd18b033be7a6b5786b5e Mon Sep 17 00:00:00 2001
From: Joseph Tran <joseph.tran@inrae.fr>
Date: Mon, 16 Jan 2023 15:53:01 +0100
Subject: [PATCH 7/7] add gitlab pages doc link

---
 README.md                 | 2 ++
 workflow/documentation.md | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/README.md b/README.md
index 3ae5d70..cc5c293 100644
--- a/README.md
+++ b/README.md
@@ -5,6 +5,8 @@ This workflow uses [Snakemake](https://snakemake.readthedocs.io/en/stable/) to q
 
 A first script (```prejob.sh```) prepares the data until *fasta.gz* files are obtained. A second script (```job.sh```) runs the genome assembly and stats.
 
+Documentation: [GitLab Pages](https://asm4pg.pages.mia.inra.fr/genomasm4pg)
+
 ![workflow DAG](workflow/doc/fig/rule_dag.svg)
 
 ## Table of contents
diff --git a/workflow/documentation.md b/workflow/documentation.md
index 8b20613..dc71e19 100644
--- a/workflow/documentation.md
+++ b/workflow/documentation.md
@@ -2,6 +2,8 @@
 
 Asm4pg is an automatic and reproducible genome assembly workflow for pangenomic applications using PacBio HiFi data.
 
+Documentation: [GitLab Pages](https://asm4pg.pages.mia.inra.fr/genomasm4pg)
+
 [TOC]
 
 ![workflow DAG](../workflow/doc/fig/rule_dag.svg)
-- 
GitLab
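Editorial note: throughout the tutorials patched above, `sbatch job.sh dry` performs a dry run while `sbatch job.sh` launches the real workflow. A hypothetical sketch of how `job.sh` could dispatch that argument — the function name and profile path are illustrative; the actual script lives in the repository:

```shell
#!/bin/sh
# Build the snakemake command line from the script's first argument.
# "dry" maps to snakemake's -n flag (print the jobs without executing them).
build_cmd() {
  profile=".config/snakemake_profile/slurm"
  if [ "$1" = "dry" ]; then
    echo "snakemake --profile $profile -n"
  else
    echo "snakemake --profile $profile"
  fi
}

build_cmd "$1"
```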