Merged
94 changes: 16 additions & 78 deletions 01-docker-terraform/2_docker_sql/README.md
@@ -17,61 +17,21 @@ wget https://github.com/DataTalksClub/nyc-tlc-data/releases/download/yellow/yell

### Running Postgres with Docker
Review comment (Collaborator):

By using Docker-managed volumes, the commands should work the same across operating systems.
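
A minimal sketch of why this holds, assuming the usual `docker run -v SOURCE:TARGET` rule that a path-like source is a bind mount into the host filesystem while a bare name is a volume Docker manages itself. The `mount_kind` helper is hypothetical and deliberately simplified; it only illustrates that the new `-v ny_taxi_postgres_data:...` form contains no host path that could differ between Windows, macOS and Linux.

```shell
#!/bin/sh
# Hypothetical helper: classify the SOURCE half of a `-v SOURCE:TARGET` mount.
# Simplified assumption: sources that look like paths are bind mounts;
# anything else is treated as a Docker-managed named volume.
mount_kind() {
  case "$1" in
    /*|./*|../*|[A-Za-z]:/*) echo "bind mount" ;;
    *) echo "named volume" ;;
  esac
}

# Source used by the old Linux/macOS command (host-specific absolute path).
mount_kind "$(pwd)/ny_taxi_postgres_data"   # prints "bind mount"

# Source used by the new command (a name only, no host path).
mount_kind "ny_taxi_postgres_data"          # prints "named volume"
```

Because the second form never names a host directory, the same command line can be pasted unchanged into any shell that can reach the Docker daemon.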


#### Windows

Running Postgres on Windows (note the full path)
Running Postgres on Windows, macOS and Linux

```bash
docker run -it \
-e POSTGRES_USER="root" \
-e POSTGRES_PASSWORD="root" \
-e POSTGRES_DB="ny_taxi" \
-v c:/Users/alexe/git/data-engineering-zoomcamp/week_1_basics_n_setup/2_docker_sql/ny_taxi_postgres_data:/var/lib/postgresql/data \
-p 5432:5432 \
postgres:13
```

If you have the following error:

```
docker run -it \
-e POSTGRES_USER="root" \
-e POSTGRES_PASSWORD="root" \
-e POSTGRES_DB="ny_taxi" \
-v e:/zoomcamp/data_engineer/week_1_fundamentals/2_docker_sql/ny_taxi_postgres_data:/var/lib/postgresql/data \
-v ny_taxi_postgres_data:/var/lib/postgresql \
-p 5432:5432 \
postgres:13

docker: Error response from daemon: invalid mode: \Program Files\Git\var\lib\postgresql\data.
See 'docker run --help'.
```

Change the mounting path. Replace it with the following:

```
-v /e/zoomcamp/...:/var/lib/postgresql/data
```

#### Linux and MacOS


```bash
docker run -it \
-e POSTGRES_USER="root" \
-e POSTGRES_PASSWORD="root" \
-e POSTGRES_DB="ny_taxi" \
-v $(pwd)/ny_taxi_postgres_data:/var/lib/postgresql/data \
-p 5432:5432 \
postgres:13
postgres:18
```

If you see that `ny_taxi_postgres_data` is empty after running
the container, try these:

* Deleting the folder and running Docker again (Docker will re-create the folder)
* Adjust the permissions of the folder by running `sudo chmod a+rwx ny_taxi_postgres_data`


### CLI for Postgres

Installing `pgcli`
@@ -125,7 +85,7 @@ Running pgAdmin
docker run -it \
-e PGADMIN_DEFAULT_EMAIL="[email protected]" \
-e PGADMIN_DEFAULT_PASSWORD="root" \
-p 8080:80 \
-p 8085:80 \
dpage/pgadmin4
```

@@ -144,11 +104,11 @@ docker run -it \
-e POSTGRES_USER="root" \
-e POSTGRES_PASSWORD="root" \
-e POSTGRES_DB="ny_taxi" \
-v c:/Users/alexe/git/data-engineering-zoomcamp/week_1_basics_n_setup/2_docker_sql/ny_taxi_postgres_data:/var/lib/postgresql/data \
-v ny_taxi_postgres_data:/var/lib/postgresql \
-p 5432:5432 \
--network=pg-network \
--name pg-database \
postgres:13
--name pgdatabase \
postgres:18
```

Run pgAdmin
@@ -157,14 +117,14 @@ Run pgAdmin
docker run -it \
-e PGADMIN_DEFAULT_EMAIL="[email protected]" \
-e PGADMIN_DEFAULT_PASSWORD="root" \
-p 8080:80 \
-p 8085:80 \
--network=pg-network \
--name pgadmin-2 \
dpage/pgadmin4
```


### Data ingestion
### Data Ingestion

Running locally

@@ -187,22 +147,6 @@ Build the image
docker build -t taxi_ingest:v001 .
```

On Linux you may have a problem building it:

```
error checking context: 'can't stat '/home/name/data_engineering/ny_taxi_postgres_data''.
```

You can solve it with `.dockerignore`:

* Create a folder `data`
* Move `ny_taxi_postgres_data` to `data` (you might need to use `sudo` for that)
* Map `-v $(pwd)/data/ny_taxi_postgres_data:/var/lib/postgresql/data`
* Create a file `.dockerignore` and add `data` there
* Check [this video](https://www.youtube.com/watch?v=tOr4hTsHOzU&list=PL3MmuxUbc_hJed7dXYoJw8DoCuVHhGEQb) (the middle) for more details



Run the script with Docker

```bash
@@ -213,47 +157,41 @@ docker run -it \
taxi_ingest:v001 \
--user=root \
--password=root \
--host=pg-database \
--host=pgdatabase \
--port=5432 \
--db=ny_taxi \
--table_name=yellow_taxi_trips \
--url=${URL}
```

### Docker-Compose
### Docker Compose

Run it:

```bash
docker-compose up
docker compose up
```

Run in detached mode:

```bash
docker-compose up -d
docker compose up -d
```

Shutting it down:

```bash
docker-compose down
```

Note: to make pgAdmin configuration persistent, create a folder `data_pgadmin`. Change its permission via

```bash
sudo chown 5050:5050 data_pgadmin
docker compose down
```

and mount it to the `/var/lib/pgadmin` folder:
Add a docker volume to the `pgadmin` container:

```yaml
services:
pgadmin:
image: dpage/pgadmin4
volumes:
- ./data_pgadmin:/var/lib/pgadmin
- data_pgadmin:/var/lib/pgadmin
...
```
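
A named volume referenced by a service generally also has to be declared under the top-level `volumes:` key, as the `docker-compose.yaml` in this change does for `ny_taxi_postgres_data`. The sketch below assumes the same applies to `data_pgadmin`: it writes a minimal Compose file with that declaration and validates it when Docker is available. The file name `compose.sketch.yaml` is hypothetical and not part of this change.

```shell
#!/bin/sh
# Sketch: write a minimal Compose file in which the pgAdmin named volume
# is declared at the top level. The file name is a hypothetical placeholder.
compose_file="${TMPDIR:-/tmp}/compose.sketch.yaml"

cat > "$compose_file" <<'EOF'
volumes:
  data_pgadmin:

services:
  pgadmin:
    image: dpage/pgadmin4
    volumes:
      - data_pgadmin:/var/lib/pgadmin
EOF

# `docker compose config --quiet` only parses and validates the file;
# it does not pull images or start containers.
if command -v docker >/dev/null 2>&1; then
  docker compose -f "$compose_file" config --quiet \
    && echo "compose file is valid" \
    || echo "validation failed (or the compose plugin is missing)"
else
  echo "docker not found; wrote $compose_file without validating"
fi
```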

10 changes: 7 additions & 3 deletions 01-docker-terraform/2_docker_sql/docker-compose.yaml
@@ -1,12 +1,16 @@
volumes:
ny_taxi_postgres_data:
driver: local

services:
pgdatabase:
image: postgres:13
image: postgres:18
environment:
- POSTGRES_USER=root
- POSTGRES_PASSWORD=root
- POSTGRES_DB=ny_taxi
volumes:
- "./ny_taxi_postgres_data:/var/lib/postgresql/data:rw"
- ny_taxi_postgres_data:/var/lib/postgresql
ports:
- "5432:5432"
pgadmin:
@@ -15,5 +19,5 @@ services:
- [email protected]
- PGADMIN_DEFAULT_PASSWORD=root
ports:
- "8080:80"
- "8085:80"
