From 52f6730f2c32561c1cbc01a91fbd4149b0e90e55 Mon Sep 17 00:00:00 2001 From: pacnpal <183241239+pacnpal@users.noreply.github.com> Date: Mon, 9 Dec 2024 20:28:31 -0500 Subject: [PATCH] Update README.md --- README.md | 221 +++++++++++++----------------------------------------- 1 file changed, 54 insertions(+), 167 deletions(-) diff --git a/README.md b/README.md index 19e79d3..bd04521 100644 --- a/README.md +++ b/README.md @@ -1,191 +1,78 @@ # README.md -# Documentation to PDF Converter -A Python script that clones documentation from a Git repository (default: Next.js), processes it, and generates a well-formatted PDF with table of contents, proper formatting, and consistent styling. +## Export Nextjs Docs Script + +The script automates the process of cloning documentation repositories, converting Markdown files to HTML, and generating PDF files. This README covers installation and usage. + +--- ## Features -- Clones specific documentation directories from Git repositories -- Processes Markdown and MDX files -- Generates table of contents with proper numbering -- Handles code blocks with filename annotations -- Processes frontmatter for metadata -- Supports image path transformations -- Creates PDF with customizable headers and footers -- Includes cover page and proper page breaks +- Clone remote repositories with sparse checkout. +- Convert Markdown files to HTML with code block and image path preprocessing. +- Generate PDFs with custom headers, footers, and styles. +- Automatically create a hierarchical Table of Contents (ToC). +- Detect the latest version of the documentation. +- Handles YAML frontmatter for metadata-rich documentation. -## Requirements +--- -### System Requirements -- Python 3.7+ -- Git installed and accessible from command line -- Internet connection for cloning repositories +## Installation -### Python Dependencies -Install all required packages using: -```bash -pip install -r requirements.txt -``` +### Prerequisites -Then install Playwright's browser: -```bash -playwright install chromium -``` +- Python 3.8+ +- Required Python packages: + - `markdown` + - `yaml` + - `tqdm` + - `playwright` + - `gitpython` +- Ensure you have Playwright installed and configured: + ```bash + pip install playwright + playwright install + ``` -## Setup - -1. Clone this repository: -```bash -git clone -cd -``` - -2. Install dependencies: -```bash -pip install -r requirements.txt -playwright install chromium -``` - -3. Ensure you have a `styles.css` file in the same directory as the script. This file should contain your desired CSS styling for the PDF output. +### Clone the Repository and Install +Clone the project repository, create a virtual environment, activate it, and install requirements. +--- + ```bash + git clone https://github.com/pacnpal/Docs-Exporter.git + cd Docs-Exporter + python -m venv .venv + source .venv/bin/activate + pip install -r requirements.txt + playwright install + ``` ## Usage -1. Basic usage with default settings (Next.js documentation): +### 1. Clone and Update Nextjs Repository +Run the script to clone or update the remote documentation repository: ```bash -python docs_to_pdf.py +python export-docs.py ``` -2. The script will: - - Clone/update the specified Git repository - - Process all documentation files - - Generate a PDF with proper formatting - - Include a cover page and table of contents +### 2. Convert Markdown to HTML +The script automatically processes `.md` and `.mdx` files, converting them to styled HTML. -## Configuration - -You can modify these variables in the script for different configurations: +### 3. Generate PDF +A PDF is created with a generated title and ToC. Ensure no other process is using the output file. +### Example Configuration ```python -repo_dir = "nextjs-docs" # Local directory for cloned repo -repo_url = "https://github.com/vercel/next.js.git" # Repository URL -branch = "canary" # Branch to clone -docs_dir = "docs" # Directory containing documentation - -# Image URL transformation settings -Change_img_url = True -base_path = "https://nextjs.org/_next/image?url=" -path_args = "&w=1920&q=75" +repo_url = "https://github.com/vercel/next.js.git" +branch = "canary" +docs_dir = "docs" ``` -## PDF Output Settings +### Output +- PDF file: `Next.js_Docs_vXX.XX.X_YYYY-MM-DD.pdf` +- Logs: Process information printed to the terminal. -The PDF generation includes: -- A4 format -- Custom margins -- Page numbers in header -- Generation date in footer -- Background colors/images -- Proper page breaks between sections +--- -## File Organization +## LICENSE -- `docs_to_pdf.py`: Main script file -- `requirements.txt`: Python dependencies -- `styles.css`: CSS styling for PDF output -- `README.md`: This documentation - -## Troubleshooting - -1. If the PDF file is locked: - - Ensure the output PDF is not open in any application - - Check file permissions - -2. If images are not loading: - - Verify internet connection - - Check if image URLs are accessible - - Adjust the `wait_for_load_state` timing if needed - -3. If the repository won't clone: - - Verify Git is installed and accessible - - Check internet connection - - Ensure you have access to the repository - -## Notes - -- The script creates temporary files during processing -- Large documentation sets may take several minutes to process -- Memory usage depends on the size of the documentation -- The script requires active internet connection for repository cloning and image processing - -## CSS Recommendations - -Your `styles.css` should include at least these basic styles for proper PDF formatting: - -```css -body { - font-family: Arial, sans-serif; - line-height: 1.6; - margin: 0; - padding: 20px; -} - -.master-container { - display: flex; - justify-content: center; - align-items: center; - min-height: 100vh; -} - -.container { - text-align: center; -} - -.title { - font-size: 24px; - font-weight: bold; - margin-bottom: 20px; -} - -.date { - font-size: 16px; -} - -.page-break { - page-break-after: always; -} - -code { - background-color: #f4f4f4; - padding: 2px 4px; - border-radius: 4px; -} - -pre { - background-color: #f8f8f8; - padding: 15px; - border-radius: 5px; - overflow-x: auto; -} - -.code-header { - background-color: #e0e0e0; - padding: 5px 10px; - border-radius: 5px 5px 0 0; -} - -table { - border-collapse: collapse; - width: 100%; - margin: 15px 0; -} - -th, td { - border: 1px solid #ddd; - padding: 8px; - text-align: left; -} - -th { - background-color: #f5f5f5; -} -``` \ No newline at end of file +This project is governed by the [LICENSE](LICENSE) file. Please ensure compliance when redistributing or modifying the script. \ No newline at end of file