Name	Name	Last commit message	Last commit date
parent directory ..
README.md	README.md
access-mongo.py	access-mongo.py
pyMongo_cursor_prompts.md	pyMongo_cursor_prompts.md

Python and MongoDB Integration Project

Overview

This directory contains Python-based examples and tutorials for working with MongoDB databases, focusing on environmental data analysis and noise mapping. The project demonstrates comprehensive database operations, data processing workflows, and integration between Python applications and MongoDB databases.

Key Features

MongoDB Integration: Complete Python-MongoDB connectivity and operations
Environmental Data Analysis: Specialised noise mapping and environmental data processing
Data Pipeline Development: End-to-end data processing workflows
Advanced Querying: Complex database queries and aggregation pipelines
Data Visualisation: Interactive charts and graphs for data insights
Error Handling: Robust error handling and logging for production environments

Prerequisites

Before running this project, ensure you have the following installed:

Python 3.7+: The project uses Python 3.13
MongoDB: A local MongoDB instance running on the default port (27017)
Git: For cloning the repository

Installation

Clone the repository:

git clone <repository-url>
cd test-mongo

Create a virtual environment:
```
python3 -m venv venv
```
Activate the virtual environment:
- On macOS/Linux:
```
source venv/bin/activate
```
- On Windows:
```
venv\Scripts\activate
```
Install required packages:
```
pip install pymongo pandas
```

Project Structure

This directory contains 2 main files for MongoDB integration:

Python/
├── README.md                 # This documentation file
├── access-mongo.py           # Main MongoDB integration script
├── pyMongo_cursor_prompts.md # MongoDB cursor and query examples
└── venv/                     # Virtual environment (created during setup)

📋 Important Note: The script expects a data/ directory containing noise_mapping_round_3.csv. This file is not included in the repository. You can download it from the UK Government Data Portal and place it in a data/ subdirectory within the Python/ folder.

Data Sources and Formats

Environmental Noise Mapping Data

The project primarily works with environmental noise mapping data that includes the following fields:

Location/Agglomeration: Geographic location name
Road_Pop_Lden>=55dB: Population exposed to road noise at 55dB or higher
Road_Pop_Lnight>=50dB: Population exposed to road noise at night at 50dB or higher
Railways_Pop_Lden>=55dB: Population exposed to railway noise at 55dB or higher
Industry_Pop_Lden>=55dB: Population exposed to industrial noise at 55dB or higher
AgglomerationPopulation: Total population of the agglomeration

Supported Data Formats

CSV Files: Primary data source format
JSON Documents: MongoDB document structure
API Data: Real-time environmental data feeds
Database Exports: Existing MongoDB collections

Usage

1. Prepare Data

Before loading data, you need to download the CSV file:

Download the dataset: Visit the UK Government Data Portal and download the noise mapping data
Create data directory: Create a data/ folder within the Python/ directory
Place CSV file: Save the downloaded file as noise_mapping_round_3.csv in the data/ folder

2. Loading Data

To load the CSV data into MongoDB:

# In access-mongo.py, uncomment this line:
load_csv_to_mongodb()

This will:

Connect to MongoDB
Clear any existing data in the noise_mapping collection
Load data from data/noise_mapping_round_3.csv
Insert all records into the collection

3. Running Queries

The main script demonstrates querying for documents where the location name starts with 'M':

python access-mongo.py

This will:

Display the structure of documents in the collection
Show the total number of documents
Query for documents where Location/Agglomeration starts with 'M'
Display the matching documents

4. Customising Queries

You can modify the query in the main() function to search for different criteria:

# Example: Find documents where location starts with 'L'
query_filter = {"Location/Agglomeration": {"$regex": r"^L", "$options": "i"}}

# Example: Find documents with population over 100,000
query_filter = {"AgglomerationPopulation": {"$gt": 100000}}

Database Configuration

Database Name: environmental
Collection Name: noise_mapping
MongoDB Connection: mongodb://localhost:27017/

Error Handling

The script includes comprehensive error handling for:

MongoDB connection failures
Missing CSV files
Data loading errors
Query execution errors

Educational Objectives

This project is designed for educational purposes and demonstrates:

Database Operations: MongoDB connectivity, CRUD operations, and advanced querying
Python Programming: Professional Python development with database integration
Data Processing: ETL workflows, data transformation, and format conversion
Error Handling: Production-grade error handling and logging practices
Code Documentation: Comprehensive documentation following UK spelling conventions
Best Practices: Industry-standard database programming techniques

Troubleshooting

Common Issues

MongoDB Connection Failed
- Ensure MongoDB is running on your system
- Check that it's accessible on port 27017
- Verify your MongoDB installation
CSV File Not Found
- Download the CSV file from the UK Government Data Portal
- Create a data/ directory in the Python/ folder
- Place the downloaded file as noise_mapping_round_3.csv in the data/ directory
- Verify the file path is correct
Virtual Environment Issues
- Make sure the virtual environment is activated
- Reinstall packages if needed: pip install -r requirements.txt

Getting Help

If you encounter issues:

Check the error messages in the terminal output
Verify all prerequisites are installed
Ensure MongoDB is running
Check that the virtual environment is activated

Advanced Features

MongoDB Aggregation Pipelines

Complex data transformation workflows
Multi-stage data processing
Performance optimisation techniques
Advanced query patterns

Data Visualisation

Interactive charts and graphs
Geographic data mapping
Real-time data dashboards
Custom visualisation components

API Integration

RESTful API development
Real-time data streaming
External service integration
Data synchronisation workflows

Contributing

This is an educational project designed for students at MK:U. Contributions that enhance learning outcomes are welcome, including:

Additional database operation examples
Enhanced error handling and logging
New data processing workflows
Improved documentation and tutorials
Performance optimisation techniques

License

This project is for educational purposes. Please ensure you have appropriate permissions for any data used.

Author

S. Hallett
Course: MKU, Big Data and Visualisation
Date: 18/06/2025

This README uses UK spelling conventions and code PEP8 conventions. It provides comprehensive documentation for students learning about Python, MongoDB, and data handling.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

Python and MongoDB Integration Project

Overview

Key Features

Prerequisites

Installation

Project Structure

Data Sources and Formats

Environmental Noise Mapping Data

Supported Data Formats

Usage

1. Prepare Data

2. Loading Data

3. Running Queries

4. Customising Queries

Database Configuration

Error Handling

Educational Objectives

Troubleshooting

Common Issues

Getting Help

Advanced Features

MongoDB Aggregation Pipelines

Data Visualisation

API Integration

Contributing

License

Author

FilesExpand file tree

Python

Directory actions

More options

Directory actions

More options

Latest commit

History

Python

Folders and files

parent directory

README.md

Python and MongoDB Integration Project

Overview

Key Features

Prerequisites

Installation

Project Structure

Data Sources and Formats

Environmental Noise Mapping Data

Supported Data Formats

Usage

1. Prepare Data

2. Loading Data

3. Running Queries

4. Customising Queries

Database Configuration

Error Handling

Educational Objectives

Troubleshooting

Common Issues

Getting Help

Advanced Features

MongoDB Aggregation Pipelines

Data Visualisation

API Integration

Contributing

License

Author