Commit 99f48f2

Merge remote-tracking branch 'origin/main' into bm/move-getting-started-1
2 parents a3a9bed + 97508b8 commit 99f48f2

132 files changed, +6923 -384 lines changed

.github/workflows/docs.yml

Lines changed: 1 addition & 1 deletion
@@ -23,7 +23,7 @@ jobs:
   uses: actions/checkout@v5

 - name: Set up Python
-  uses: actions/setup-python@v5
+  uses: actions/setup-python@v6
   with:
     python-version: "3.11"
     cache: 'pip'

.github/workflows/nightly.yml

Lines changed: 1 addition & 1 deletion
@@ -23,7 +23,7 @@ jobs:
   uses: actions/checkout@v5

 - name: Set up Python
-  uses: actions/setup-python@v5
+  uses: actions/setup-python@v6
   with:
     python-version: "3.11"
     cache: 'pip'

docs/_include/links.md

Lines changed: 1 addition & 0 deletions
@@ -35,6 +35,7 @@
 [HNSW]: https://en.wikipedia.org/wiki/Hierarchical_navigable_small_world
 [HNSW paper]: https://arxiv.org/pdf/1603.09320
 [HoloViews]: https://www.holoviews.org/
+[HoloViz]: https://holoviz.org/
 [Indexing, Columnar Storage, and Aggregations]: https://cratedb.com/product/features/indexing-columnar-storage-aggregations
 [InfluxDB]: https://github.com/influxdata/influxdb
 [inverted index]: https://en.wikipedia.org/wiki/Inverted_index

docs/conf.py

Lines changed: 11 additions & 1 deletion
@@ -77,11 +77,20 @@
     r"https://openai.com/index/gpt-4/.*",
     # 403 Client Error: Forbidden for url
     r"https://www.npmjs.com/",
+    r"https://www.computerhope.com/",
+    # Out of service.
+    r"https://s3.amazonaws.com/nyc-tlc/.*",
+    # 2025-09-29: Phased out CrateDB 3.3 docs
+    r"https://cratedb.com/docs/crate/reference/en/3.3/",
+    # 403 Client Error: Forbidden for url
+    r"https://docs.docker.com/",
 ]

 linkcheck_anchors_ignore_for_url += [
     # Anchor 'XXX' not found
-    r"https://pypi.org/.*"
+    r"https://pypi.org/.*",
+    # https://kafka.apache.org/documentation/#topicconfigs - Anchor 'topicconfigs' not found
+    r"https://kafka.apache.org/.*",
 ]

 # Configure intersphinx.
@@ -110,6 +119,7 @@
     "readme_github": "[![README](https://img.shields.io/badge/Open-README-darkblue?logo=GitHub)]",
     "blog": "[![Blog](https://img.shields.io/badge/Open-Blog-darkblue?logo=Markdown)]",
     "tutorial": "[![Navigate to Tutorial](https://img.shields.io/badge/Navigate%20to-Tutorial-darkcyan?logo=Markdown)]",
+    "guide": "[![Navigate to usage guide](https://img.shields.io/badge/Navigate%20to-usage%20guide-darkcyan?logo=Markdown)]",
     "readmore": "[![Read More](https://img.shields.io/badge/Read-More-darkyellow?logo=Markdown)]",
})
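
Reviewer note on the ignore-list entries in the first hunk of docs/conf.py: the entries are regular expressions. The sketch below shows how the four added patterns would behave, assuming the link checker applies each pattern with `re.match`, i.e. anchored at the start of the URL. That assumption, the `is_ignored` helper, and the sample URLs are illustrative only and not part of this commit.

```python
import re

# Patterns copied from the added lines of the docs/conf.py hunk.
added_patterns = [
    r"https://www.computerhope.com/",
    r"https://s3.amazonaws.com/nyc-tlc/.*",
    r"https://cratedb.com/docs/crate/reference/en/3.3/",
    r"https://docs.docker.com/",
]


def is_ignored(url, patterns=added_patterns):
    """Hypothetical helper: True if any pattern matches at the start of the URL."""
    return any(re.match(pattern, url) for pattern in patterns)


# A deeper path under an ignored prefix is still matched.
print(is_ignored("https://docs.docker.com/engine/install/"))  # True
# Only the 3.3 docs tree is ignored, not other versions.
print(is_ignored("https://cratedb.com/docs/crate/reference/en/latest/"))  # False
```

Under that assumption, a trailing `.*` is optional for prefix-style entries, which would explain why some of the added patterns omit it.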

docs/connect/df/index.md

Lines changed: 4 additions & 89 deletions
@@ -6,103 +6,18 @@

 How to use CrateDB together with popular open-source DataFrame libraries.

-(dask)=
 ## Dask
-
-:::{rubric} About
-:::
-[Dask] is a parallel computing library for analytics with task scheduling.
-It is built on top of the Python programming language, making it easy to scale
-the Python libraries that you know and love, like NumPy, pandas, and scikit-learn.
-
-```{div}
-:style: "float: right"
-[![](https://github.com/crate/crate-clients-tools/assets/453543/99bd2234-c501-479b-ade7-bcc2bfc1f288){w=180px}](https://www.dask.org/)
-```
-
-- [Dask DataFrames] help you process large tabular data by parallelizing pandas,
-either on your laptop for larger-than-memory computing, or on a distributed
-cluster of computers.
-
-- [Dask Futures], implementing a real-time task framework, allow you to scale
-generic Python workflows across a Dask cluster with minimal code changes,
-by extending Python's `concurrent.futures` interface.
-
-```{div}
-:style: "clear: both"
-```
-
-:::{rubric} Learn
+:::{seealso}
+Please navigate to the dedicated page about {ref}`dask`.
 :::
-- [Guide to efficient data ingestion to CrateDB with pandas and Dask]
-- [Efficient batch/bulk INSERT operations with pandas, Dask, and SQLAlchemy]
-- [Import weather data using Dask]
-- [Dask code examples]
-

-(pandas)=
 ## pandas
-
-:::{rubric} About
-:::
-
-```{div}
-:style: "float: right"
-[![](https://pandas.pydata.org/static/img/pandas.svg){w=180px}](https://pandas.pydata.org/)
-```
-
-[pandas] is a fast, powerful, flexible, and easy-to-use open-source data analysis
-and manipulation tool, built on top of the Python programming language.
-
-Pandas (stylized as pandas) is a software library written for the Python programming
-language for data manipulation and analysis. In particular, it offers data structures
-and operations for manipulating numerical tables and time series.
-
-:::{rubric} Data Model
-:::
-- Pandas is built around data structures called Series and DataFrames. Data for these
-collections can be imported from various file formats such as comma-separated values,
-JSON, Parquet, SQL database tables or queries, and Microsoft Excel.
-- A Series is a 1-dimensional data structure built on top of NumPy's array.
-- Pandas includes support for time series, such as the ability to interpolate values
-and filter using a range of timestamps.
-- By default, a Pandas index is a series of integers ascending from 0, similar to the
-indices of Python arrays. However, indices can use any NumPy data type, including
-floating point, timestamps, or strings.
-- Pandas supports hierarchical indices with multiple values per data point. An index
-with this structure, called a "MultiIndex", allows a single DataFrame to represent
-multiple dimensions, similar to a pivot table in Microsoft Excel. Each level of a
-MultiIndex can be given a unique name.
-
-```{div}
-:style: "clear: both"
-```
-
-:::{rubric} Learn
+:::{seealso}
+Please navigate to the dedicated page about {ref}`pandas`.
 :::
-- [Guide to efficient data ingestion to CrateDB with pandas]
-- [Importing Parquet files into CrateDB using Apache Arrow and SQLAlchemy]
-- [pandas code examples]
-- [From data storage to data analysis: Tutorial on CrateDB and pandas]


 ## Polars
 :::{seealso}
 Please navigate to the dedicated page about {ref}`polars`.
 :::
-
-
-[Apache Arrow]: https://arrow.apache.org/
-[Dask]: https://www.dask.org/
-[Dask DataFrames]: https://docs.dask.org/en/latest/dataframe.html
-[Dask Futures]: https://docs.dask.org/en/latest/futures.html
-[pandas]: https://pandas.pydata.org/
-
-[Dask code examples]: https://github.com/crate/cratedb-examples/tree/main/by-dataframe/dask
-[Efficient batch/bulk INSERT operations with pandas, Dask, and SQLAlchemy]: https://cratedb.com/docs/python/en/latest/by-example/sqlalchemy/dataframe.html
-[From data storage to data analysis: Tutorial on CrateDB and pandas]: https://community.cratedb.com/t/from-data-storage-to-data-analysis-tutorial-on-cratedb-and-pandas/1440
-[Guide to efficient data ingestion to CrateDB with pandas]: https://community.cratedb.com/t/guide-to-efficient-data-ingestion-to-cratedb-with-pandas/1541
-[Guide to efficient data ingestion to CrateDB with pandas and Dask]: https://community.cratedb.com/t/guide-to-efficient-data-ingestion-to-cratedb-with-pandas-and-dask/1482
-[Import weather data using Dask]: https://github.com/crate/cratedb-examples/blob/main/topic/timeseries/dask-weather-data-import.ipynb
-[Importing Parquet files into CrateDB using Apache Arrow and SQLAlchemy]: https://community.cratedb.com/t/importing-parquet-files-into-cratedb-using-apache-arrow-and-sqlalchemy/1161
-[pandas code examples]: https://github.com/crate/cratedb-examples/tree/main/by-dataframe/pandas

docs/connect/drivers.md

Lines changed: 1 addition & 1 deletion
@@ -47,7 +47,7 @@ For connecting to CrateDB from any environment that supports it.
 [Npgsql](https://www.npgsql.org/)
 ```
 ```{sd-item}
-An open source ADO.NET Data Provider for PostgreSQL, for program written in C#,
+An open source ADO\.NET Data Provider for PostgreSQL, for program written in C#,
 Visual Basic, and F#.
 ```
 ```{sd-item}

docs/feature/document/index.md

Lines changed: 4 additions & 4 deletions
@@ -254,7 +254,7 @@ Learn fundamentals about CrateDB's OBJECT data type.
 ::::


-:::{rubric} Tutorials
+:::{rubric} Guides
 :::

 ::::{info-card}
@@ -292,11 +292,11 @@ Today's data management tasks need to handle multi-structured and
 different data sources. CrateDB's dynamic OBJECT data type allows you to
 store and analyze complex and nested data efficiently.

-In this tutorial, we will explore how to leverage this feature in marketing
+In this usage guide, we will explore how to leverage this feature in marketing
 data analysis, along with the use of [generated columns], to parse and manage
 URLs.

-{{ '{}(#objects-basics)'.format(tutorial) }}
+{{ '{}(#objects-basics)'.format(guide) }}
 :::

 :::{grid-item}
@@ -427,5 +427,5 @@ and about OBJECT indexing.
 :maxdepth: 1
 :hidden:

-Tutorial <tutorial>
+Usage <usage>
```
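
Reviewer note on the `{{ '{}(#objects-basics)'.format(guide) }}` change in the second hunk: the expression relies on the `guide` substitution that this commit adds in docs/conf.py. The sketch below shows the string it should produce, assuming the template expression behaves like Python's `str.format`. Plain Python stands in for the template engine here.

```python
# Value of the "guide" substitution, copied from the docs/conf.py hunk.
guide = (
    "[![Navigate to usage guide]"
    "(https://img.shields.io/badge/Navigate%20to-usage%20guide-darkcyan?logo=Markdown)]"
)

# Same expression as in docs/feature/document/index.md, evaluated directly.
rendered = "{}(#objects-basics)".format(guide)
print(rendered)
```

The result is the badge image wrapped in a Markdown link to the `objects-basics` target. The renamed usage.md keeps that target and adds `objects-usage` next to it, so the link should still resolve.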

docs/feature/document/tutorial.md renamed to docs/feature/document/usage.md

Lines changed: 3 additions & 2 deletions
@@ -1,10 +1,11 @@
 (objects-basics)=
+(objects-usage)=

 # Objects: Analyzing Marketing Data

 Marketers often need to handle multi-structured data from different platforms.
 CrateDB's dynamic `OBJECT` data type allows us to store and analyze this complex,
-nested data efficiently. In this tutorial, we'll explore how to leverage this
+nested data efficiently. In this usage guide, we'll explore how to leverage this
 feature in marketing data analysis, along with the use of generated columns to
 parse and manage URLs.

@@ -124,5 +125,5 @@ GROUP BY 1
 ORDER BY 2 DESC;
 :::

-In this tutorial, we explored the versatility and power of CrateDB's dynamic
+In this usage guide, we explored the versatility and power of CrateDB's dynamic
 `OBJECT` data type for handling complex, nested marketing data.

docs/feature/query/index.md

Lines changed: 7 additions & 0 deletions
@@ -228,6 +228,13 @@ It is also not in the same shape as the other pages in this section.
 :::


+:::{toctree}
+:maxdepth: 1
+:hidden:
+Recurrent queries <recurrent>
+:::
+
+
 [Analyze Device Readings with Metadata Integration]: project:#timeseries-with-metadata
 [bulk operations interface]: inv:crate-reference#http-bulk-ops
 [bulk operations for INSERTs]: project:#inserts-bulk-operations
