Skip to content

Commit fe1bb95

Browse files
author
Devdutt Shenoi
authored
Merge branch 'main' into query-param
2 parents 43db1b6 + dbc2845 commit fe1bb95

File tree

10 files changed

+142
-61
lines changed

10 files changed

+142
-61
lines changed

CONTRIBUTING.md

Lines changed: 68 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,68 @@
1+
### Contributing
2+
3+
This document outlines how you can contribute to Parseable.
4+
5+
Thank you for considering to contribute to Parseable. The goal of this document is to provide everything you need to start your contribution. We encourage all contributions, including but not limited to:
6+
- Code patches, bug fixes.
7+
- Tutorials or blog posts.
8+
- Improving the documentation.
9+
- Submitting [bug reports](https://github.com/parseablehq/parseable/issues/new).
10+
11+
### Prerequisites
12+
13+
- Your PR has an associated issue. Find an [existing issue](https://github.com/parseablehq/parseable/issues) or open a new issue.
14+
- You've discussed the with [Parseable community](https://logg.ing/community).
15+
- You're familiar with GitHub Pull Requests(PR) workflow.
16+
- You've read the [Parseable documentation](https://www.parseable.com/docs).
17+
18+
### Contribution workflow
19+
20+
- Fork the [Parseable repository](https://help.github.com/en/github/getting-started-with-github/fork-a-repo) in your own GitHub account.
21+
- [Create a new branch](https://help.github.com/en/github/collaborating-with-issues-and-pull-requests/creating-and-deleting-branches-within-your-repository).
22+
- Review the Development Workflow section that describes the steps to maintain the repository.
23+
- Make your changes on your branch.
24+
- Submit the branch as a Pull Request pointing to the main branch of the Parseable repository. A maintainer should comment and/or review your Pull Request within a few hours.
25+
- You’ll be asked to review & sign the [Parseable Contributor License Agreement (CLA)](https://github.com/parseablehq/.github/blob/main/CLA.md) on the GitHub PR. Please ensure that you review the document. Once you’re ready, please sign the CLA by adding a comment I have read the CLA Document and I hereby sign the CLA in your Pull Request.
26+
- Please ensure that the commits in your Pull Request are made by the same GitHub user (which was used to create the PR and add the comment).
27+
28+
### Development workflow
29+
30+
We recommend Linux or macOS as the development platform for Parseable.
31+
32+
#### Setup and run Parseable
33+
34+
Parseable needs Rust 1.77.1 or above. Use the below command to build and run Parseable binary with local mode.
35+
36+
```sh
37+
cargo run --release local-store
38+
```
39+
40+
We recommend using the --release flag to test the full performance of Parseable.
41+
42+
#### Running Tests
43+
44+
```sh
45+
cargo test
46+
```
47+
48+
This command will be triggered to each PR as a requirement for merging it.
49+
50+
If you get a "Too many open files" error you might want to increase the open file limit using this command:
51+
52+
```sh
53+
ulimit -Sn 3000
54+
```
55+
56+
If you get a OpenSSL related error while building Parseable, you might need to install the dependencies using this command:
57+
58+
```sh
59+
sudo apt install build-essential
60+
sudo apt-get install libssl-dev
61+
sudo apt-get install pkg-config
62+
```
63+
64+
### Git Guidelines
65+
66+
- The PR title should be accurate and descriptive of the changes.
67+
- The draft PRs are recommended when you want to show that you are working on something and make your work visible. Convert your PR as a draft if your changes are a work in progress. Maintainers will review the PR once you mark your PR as ready for review.
68+
- The branch related to the PR must be up-to-date with main before merging. We use Bors to automatically enforce this requirement without the PR author having to rebase manually.

Cargo.lock

Lines changed: 2 additions & 9 deletions
Some generated files are not rendered by default. Learn more about customizing how changed files appear on GitHub.

Cross.toml

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,5 @@
1+
[target.aarch64-unknown-linux-gnu]
2+
image = "ghcr.io/cross-rs/aarch64-unknown-linux-gnu@sha256:1e2a0291f92a4372cbc22d8994e735473045383f1ce7fa44a16c234ba00187f4"
3+
4+
[target.x86_64-unknown-linux-gnu]
5+
image = "ghcr.io/cross-rs/x86_64-unknown-linux-gnu@sha256:bf05360bb9d6d4947eed60532ac7a0d7e8fae8f214e9abb801d5941c8fe4918d"

src/alerts/alerts_utils.rs

Lines changed: 3 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -31,6 +31,7 @@ use tracing::trace;
3131

3232
use crate::{
3333
alerts::AggregateCondition,
34+
parseable::PARSEABLE,
3435
query::{TableScanVisitor, QUERY_SESSION},
3536
rbac::{
3637
map::SessionKey,
@@ -137,8 +138,9 @@ async fn execute_base_query(
137138
AlertError::CustomError(format!("Table name not found in query- {}", original_query))
138139
})?;
139140

141+
let time_partition = PARSEABLE.get_stream(&stream_name)?.get_time_partition();
140142
query
141-
.get_dataframe(stream_name)
143+
.get_dataframe(time_partition.as_ref())
142144
.await
143145
.map_err(|err| AlertError::CustomError(err.to_string()))
144146
}

src/alerts/mod.rs

Lines changed: 7 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -37,7 +37,7 @@ use ulid::Ulid;
3737
pub mod alerts_utils;
3838
pub mod target;
3939

40-
use crate::parseable::PARSEABLE;
40+
use crate::parseable::{StreamNotFound, PARSEABLE};
4141
use crate::query::{TableScanVisitor, QUERY_SESSION};
4242
use crate::rbac::map::SessionKey;
4343
use crate::storage;
@@ -514,17 +514,16 @@ impl AlertConfig {
514514

515515
// for now proceed in a similar fashion as we do in query
516516
// TODO: in case of multiple table query does the selection of time partition make a difference? (especially when the tables don't have overlapping data)
517-
let stream_name = if let Some(stream_name) = query.first_table_name() {
518-
stream_name
519-
} else {
517+
let Some(stream_name) = query.first_table_name() else {
520518
return Err(AlertError::CustomError(format!(
521519
"Table name not found in query- {}",
522520
self.query
523521
)));
524522
};
525523

524+
let time_partition = PARSEABLE.get_stream(&stream_name)?.get_time_partition();
526525
let base_df = query
527-
.get_dataframe(stream_name)
526+
.get_dataframe(time_partition.as_ref())
528527
.await
529528
.map_err(|err| AlertError::CustomError(err.to_string()))?;
530529

@@ -704,6 +703,8 @@ pub enum AlertError {
704703
CustomError(String),
705704
#[error("Invalid State Change: {0}")]
706705
InvalidStateChange(String),
706+
#[error("{0}")]
707+
StreamNotFound(#[from] StreamNotFound),
707708
}
708709

709710
impl actix_web::ResponseError for AlertError {
@@ -717,6 +718,7 @@ impl actix_web::ResponseError for AlertError {
717718
Self::DatafusionError(_) => StatusCode::INTERNAL_SERVER_ERROR,
718719
Self::CustomError(_) => StatusCode::BAD_REQUEST,
719720
Self::InvalidStateChange(_) => StatusCode::BAD_REQUEST,
721+
Self::StreamNotFound(_) => StatusCode::NOT_FOUND,
720722
}
721723
}
722724

src/cli.rs

Lines changed: 3 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -260,11 +260,12 @@ pub struct Options {
260260
help = "Set a fixed memory limit for query in GiB"
261261
)]
262262
pub query_memory_pool_size: Option<usize>,
263-
263+
// reduced the max row group size from 1048576
264+
// smaller row groups help in faster query performance in multi threaded query
264265
#[arg(
265266
long,
266267
env = "P_PARQUET_ROW_GROUP_SIZE",
267-
default_value = "1048576",
268+
default_value = "262144",
268269
help = "Number of rows in a row group"
269270
)]
270271
pub row_group_size: usize,

src/handlers/airplane.rs

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -38,7 +38,7 @@ use crate::handlers::http::query::{into_query, update_schema_when_distributed};
3838
use crate::handlers::livetail::cross_origin_config;
3939
use crate::metrics::QUERY_EXECUTE_TIME;
4040
use crate::parseable::PARSEABLE;
41-
use crate::query::{TableScanVisitor, QUERY_SESSION};
41+
use crate::query::{execute, TableScanVisitor, QUERY_SESSION};
4242
use crate::utils::arrow::flight::{
4343
append_temporary_events, get_query_from_ticket, into_flight_data, run_do_get_rpc,
4444
send_to_ingester,
@@ -215,8 +215,8 @@ impl FlightService for AirServiceImpl {
215215
Status::permission_denied("User Does not have permission to access this")
216216
})?;
217217
let time = Instant::now();
218-
let (records, _) = query
219-
.execute(stream_name.clone())
218+
219+
let (records, _) = execute(query, &stream_name)
220220
.await
221221
.map_err(|err| Status::internal(err.to_string()))?;
222222

src/handlers/http/query.rs

Lines changed: 6 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -39,9 +39,9 @@ use crate::handlers::http::fetch_schema;
3939
use crate::event::commit_schema;
4040
use crate::metrics::QUERY_EXECUTE_TIME;
4141
use crate::option::Mode;
42-
use crate::parseable::PARSEABLE;
42+
use crate::parseable::{StreamNotFound, PARSEABLE};
4343
use crate::query::error::ExecuteError;
44-
use crate::query::{CountsRequest, CountsResponse, Query as LogicalQuery};
44+
use crate::query::{execute, CountsRequest, CountsResponse, Query as LogicalQuery};
4545
use crate::query::{TableScanVisitor, QUERY_SESSION};
4646
use crate::rbac::Users;
4747
use crate::response::QueryResponse;
@@ -130,7 +130,8 @@ pub async fn query(req: HttpRequest, query_request: Query) -> Result<HttpRespons
130130

131131
return Ok(HttpResponse::Ok().json(response));
132132
}
133-
let (records, fields) = query.execute(table_name.clone()).await?;
133+
134+
let (records, fields) = execute(query, &table_name).await?;
134135

135136
let response = QueryResponse {
136137
records,
@@ -318,6 +319,8 @@ Description: {0}"#
318319
ActixError(#[from] actix_web::Error),
319320
#[error("Error: {0}")]
320321
Anyhow(#[from] anyhow::Error),
322+
#[error("Error: {0}")]
323+
StreamNotFound(#[from] StreamNotFound),
321324
}
322325

323326
impl actix_web::ResponseError for QueryError {

src/parseable/mod.rs

Lines changed: 8 additions & 14 deletions
Original file line numberDiff line numberDiff line change
@@ -28,7 +28,7 @@ use http::{header::CONTENT_TYPE, HeaderName, HeaderValue, StatusCode};
2828
use once_cell::sync::Lazy;
2929
pub use staging::StagingError;
3030
use streams::StreamRef;
31-
pub use streams::{StreamNotFound, Streams};
31+
pub use streams::{Stream, StreamNotFound, Streams};
3232
use tracing::error;
3333

3434
#[cfg(feature = "kafka")]
@@ -451,23 +451,17 @@ impl Parseable {
451451
.await;
452452
}
453453

454-
let time_partition_in_days = if !time_partition_limit.is_empty() {
455-
Some(validate_time_partition_limit(&time_partition_limit)?)
456-
} else {
457-
None
458-
};
459-
460-
if let Some(custom_partition) = &custom_partition {
461-
validate_custom_partition(custom_partition)?;
462-
}
463-
464-
if !time_partition.is_empty() && custom_partition.is_some() {
454+
if !time_partition.is_empty() || !time_partition_limit.is_empty() {
465455
return Err(StreamError::Custom {
466-
msg: "Cannot set both time partition and custom partition".to_string(),
456+
msg: "Creating stream with time partition is not supported anymore".to_string(),
467457
status: StatusCode::BAD_REQUEST,
468458
});
469459
}
470460

461+
if let Some(custom_partition) = &custom_partition {
462+
validate_custom_partition(custom_partition)?;
463+
}
464+
471465
let schema = validate_static_schema(
472466
body,
473467
stream_name,
@@ -479,7 +473,7 @@ impl Parseable {
479473
self.create_stream(
480474
stream_name.to_string(),
481475
&time_partition,
482-
time_partition_in_days,
476+
None,
483477
custom_partition.as_ref(),
484478
static_schema_flag,
485479
schema,

0 commit comments

Comments
 (0)