Commit f3e589c: "Not too bad limited Beta"

1 parent 27d827d
10 files changed: +188 -1251 lines


Cargo.toml (+2)

```diff
@@ -20,5 +20,7 @@ protobuf = "1.0.24"
 rustc-serialize = "0.3.22"
 goauth = "0.2.12"
 
+lazy_static = "0.2.4"
+
 [build-dependencies]
 gcc = "*"
```

Makefile (+1 -1)

```diff
@@ -1,5 +1,5 @@
 EXTENSION = bigtable
-EXTVERSION = 0.0.1
+EXTVERSION = 0.1.0
 
 DATA = sql/$(EXTENSION)--$(EXTVERSION).sql
 PG_CONFIG = pg_config
```

README.md (+42 -54)

````diff
@@ -2,87 +2,75 @@
 
 # Google Bigtable Rust PostgreSQL FDW
 
-## IGNORE DOCS BELOW, PROPER ONES COMING SOON :)
+[Rust](https://www.rust-lang.org/en-US/) [PostgreSQL](https://www.postgresql.org/) foreign data wrapper for interfacing with [Google Cloud Bigtable](https://cloud.google.com/bigtable/), as well as other API compatible databases ([HBase](https://hbase.apache.org/) should work with some effort).
 
-[Rust](https://www.rust-lang.org/en-US/) [PostgreSQL](https://www.postgresql.org/) extension for interfacing with [Google Cloud Bigtable](https://cloud.google.com/bigtable/), as well as other API compatible databases ([HBase](https://hbase.apache.org/) should work with some effort).
-
-While logic is contained in `Rust`, it leverages `PostgreSQL` `C` macros for passing parameters around as well as returning values, and for ease of use it is all wrapped in `PL/pgSQL`.
-
-At the moment **reading** and **writing** from/to `Bigtable` is supported, with *deleting* and *updating* on the roadmap, as well as a few other more Bigtable oriented features, and encrypted credential storage. At present it is more of an exercise in writing a `PostgreSQL` extension in `Rust` than anything else.
+While the logic is contained in `Rust`, it leverages the `PostgreSQL` `C` FDW callbacks.
 
+### Roadmap
+[x] `select`
+[x] `select limit`
+[ ] `select offset`
+[ ] `select where`
+[x] `insert`
+[ ] `update`
+[ ] `delete`
 
 ## Installation
-+ `PostgreSQL 9.3+`
++ `PostgreSQL 9.6+`
 + `Stable Rust 1.15+`, get it using [rustup](https://www.rustup.rs/).
 
 ```bash
-git clone https://github.com/durch/google-bigtable-postgres-extension.git
-cd google-bigtable-postgres-extension
+git clone https://github.com/durch/google-bigtable-postgres-fdw.git
+cd google-bigtable-postgres-fdw
 make install
 psql -U postgres
 ```
 
-Once inside the DB
+### Initial DB setup
 
-```sql
+```plpgsql
 CREATE EXTENSION bigtable;
+CREATE SERVER test FOREIGN DATA WRAPPER bigtable OPTIONS (instance '`instance_id`', project '`project_id`');
+CREATE FOREIGN TABLE test(bt json) SERVER test OPTIONS (name '`table_name`');
+CREATE USER MAPPING FOR postgres SERVER test OPTIONS (credentials_path '`path_to_service_account_json_credentials`');
 ```
 
-The command above will also output a message about inputting your credentials, which you can get from the Google Cloud Console. The intention is to work using service accounts; you'll need the proper [scopes](https://cloud.google.com/bigtable/docs/creating-compute-instance) in order to be able to use the extension. Once you have the `json` credential file, you can feed it in:
-
-```sql
-SELECT bt_set_credentials('<absolute_path_to_gcloud_json_credentials>');
-```
-
-The contents of the file are read, `base64` encoded and stored in the `bt_auth_config` table. This is completely insecure, and you should take care not to allow bad guys access to this key, especially if it has admin scopes.
-
-You can delete the file from the system after reading it in; it will be dumped and restored along with other `DB` tables.
+### Usage
 
-## Usage
+You can use [gen.py]() to generate some test data. Modify `gen.py` to adjust the number of generated records, and also modify the `column` key in the generated output, as this needs to be a `column family` that **exists** in your Bigtable. Running `python gen.py` outputs `test.sql`, which can be fed into PG. `WHERE` is evaluated on the PG side, so be sure to grab what you need from BT.
 
-### Reading
+```
+psql -U postgres < test.sql
+```
 
-```sql
-# SIGNATURE
-bt_read_rows(instance_name TEXT, table_name TEXT, limit INT) RETURNS JSON
+#### SELECT
 
-# EXAMPLE
-# Reading 10 rows from test_table in test-instance
-SELECT bt_read_rows('test-instance', 'test-table', 10);
+One Bigtable row per PG row is returned; `limit` is done on the BT side. Rows are returned as `json` and can be further manipulated using the Postgres `json` [functions and operators](https://www.postgresql.org/docs/9.6/static/functions-json.html).
 
-# Output will be valid json, it will fail if the table is empty
 ```
+SELECT * FROM test;
+SELECT * FROM test LIMIT 100;
 
-### Writing
-
-#### Writing one row at a time
-
-```sql
-# SIGNATURES
-bt_write_one(column_family TEXT, column_qulifier TEXT, rows TEXT, instance_name TEXT, table_name TEXT) RETURNS TEXT
+SELECT bt->'familyName', bt->'qualifier' FROM test WHERE bt->>'rowKey' ~* '.*regex.*';
+SELECT bt->'familyName', bt->'qualifier' FROM test WHERE bt->>'rowKey' = 'exact';
+```
 
-bt_write_one(column_family TEXT, column_qulifier TEXT, rows JSON, instance_name TEXT, table_name TEXT) RETURNS TEXT
+#### INSERT
 
-# EXAMPLES
-SELECT bt_write_one('cf1', 't', 'Sample text row', 'test-instance','test-table');
+`INSERT` format is a bit weird ATM:
 
-SELECT bt_write_one('cf1', 't', '"Sample json text row"', 'test-instance','test-table');
+```json
 
-SELECT bt_write_one('cf1', 't', '{"json_object_row": true}', 'test-instance','test-table');
+{
+  "row_key": string,
+  "column": string,
+  "column_qualifier": string,
+  "data": [
+    json
+  ]
+}
 
-SELECT bt_write_one('cf1', 't', '["json", "array", "row"]', 'test-instance','test-table');
 ```
 
-#### Writing many rows at once
-
-In the case of `bt_write_many`, `json` *arrays* are unpacked into rows; all other `json` types will be written as one row, as if using `bt_write_one`.
+Currently `row_key` is treated as a prefix and concatenated with a loop counter. While this covers a few use cases, it is not really ideal for Bigtable; it will likely be extended to allow passing a `row_key` array. As you are passing in one `json` object which gets expanded, the `INSERT` counter always shows one row inserted; the truth can be found in the PG logs.
 
-This should enable straightforward import of data into Bigtable, as PostgreSQL has some very nice [JSON functions](https://www.postgresql.org/docs/9.6/static/functions-json.html) for formatting and converting data.
-
-```sql
-# SIGNATURE
-bt_write_many(column_family TEXT, column_qulifier TEXT, rows JSON, instance_name IN TEXT, table_name TEXT) RETURNS TEXT
-
-# EXAMPLE
-SELECT bt_write_many('cf1', 't', '["this", "will", "make", 5, "rows"]', 'test-instance', 'test-table');
-```
````
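The `INSERT` payload shape documented in the README above can be sketched with a small Python helper. This is a hypothetical illustration, not part of the repository: the `build_insert` function name is invented here, and the `cf1` column family and `test` table name are assumptions.

```python
import json

def build_insert(row_key, column_family, qualifier, records, table="test"):
    # Build one payload in the shape the README documents: "row_key" is a
    # prefix, "column" must name an existing column family, and "data" is an
    # array that the FDW expands into individual Bigtable rows.
    payload = {
        "row_key": row_key,
        "column": column_family,
        "column_qualifier": qualifier,
        "data": records,
    }
    # The whole payload is passed to PG as a single single-quoted json literal.
    return "insert into {} values('{}');".format(table, json.dumps(payload))

stmt = build_insert("prefix", "cf1", "test", [{"id": 1}, {"id": 2}])
```

Feeding `stmt` to `psql` would insert two Bigtable rows, even though PG reports one row inserted, as noted above.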

bigtable.control (+1 -1)

```diff
@@ -1,5 +1,5 @@
 # base36 extension in rust
 comment = 'Rust extension for interfaceing with Google Bigtable from PostgreSQL'
-default_version = '0.0.1'
+default_version = '0.1.0'
 module_pathname = '$libdir/bigtable_pg_ext'
 relocatable = true
```

gen.py (+12 -4)

```diff
@@ -2,13 +2,21 @@
 import random
 import string
 
+N_RECORDS = 1000  # number of records generated
+OUT_FILE = "test.sql"
+
+# Be sure to modify the `column` key below
+
 def id_gen(N=64):
     return ''.join(random.choice(string.ascii_lowercase + string.digits) for _ in range(N))
 
 def coord_gen(rng=90):
     return random.randint(rng*-1, rng)
 
-l = [{"id": id_gen(), "lat": coord_gen(90), "lng": coord_gen(180)} for _ in xrange(100000)]
-dt = json.dumps({"row_key": id_gen(), "column": id_gen(), "column_qualifier": id_gen(), "data": l})
-with open("test.sql", "wb") as fp:
-    fp.write("insert into test values('{}');".format(dt))
+if __name__ == "__main__":
+
+    l = [{"id": id_gen(), "lat": coord_gen(90), "lng": coord_gen(180)} for _ in xrange(N_RECORDS)]
+
+    dt = json.dumps({"row_key": id_gen(), "column": 'cf1', "column_qualifier": 'test', "data": l})
+    with open(OUT_FILE, "wb") as fp:
+        fp.write("insert into test values('{}');".format(dt))
```
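For reference, the generator after this commit looks roughly like the following standalone sketch. It is ported to Python 3 as an assumption (`xrange` becomes `range`, the output file is opened in text mode); the repository version targets Python 2.

```python
import json
import random
import string

N_RECORDS = 1000  # number of records generated
OUT_FILE = "test.sql"

def id_gen(N=64):
    # Random 64-character lowercase alphanumeric identifier.
    return ''.join(random.choice(string.ascii_lowercase + string.digits) for _ in range(N))

def coord_gen(rng=90):
    # Random integer coordinate in [-rng, rng].
    return random.randint(-rng, rng)

if __name__ == "__main__":
    # "column" must name a column family that exists in your Bigtable ('cf1' assumed).
    l = [{"id": id_gen(), "lat": coord_gen(90), "lng": coord_gen(180)} for _ in range(N_RECORDS)]
    dt = json.dumps({"row_key": id_gen(), "column": "cf1", "column_qualifier": "test", "data": l})
    with open(OUT_FILE, "w") as fp:
        fp.write("insert into test values('{}');".format(dt))
```

Running it produces `test.sql`, which can then be piped into `psql -U postgres` as shown in the README.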

sql/bigtable.sql (+1 -5)

```diff
@@ -1,7 +1,3 @@
--- drop extension bigtable cascade; create extension bigtable; create server test foreign data wrapper bt_fdw options (instance 'testinst', project 'drazens-bigtable-testing'); create foreign table test(bt json) server test options (name 'test'); create user mapping for postgres server test options (credentials_path '/tmp/code/drazens-bigtable-testing-9f32bd2aa193.json');
---
---
-
 -- -- complain if script is sourced in psql, rather than via CREATE EXTENSION
 \echo Use "CREATE EXTENSION bigtable" to load this file. \quit
 
@@ -15,6 +11,6 @@ RETURNS void
 AS '$libdir/bigtable'
 LANGUAGE C STRICT;
 
-CREATE FOREIGN DATA WRAPPER bt_fdw
+CREATE FOREIGN DATA WRAPPER bigtable
 HANDLER bt_fdw_handler
 VALIDATOR bt_fdw_validator;
```

src/bt_fdw.c (+3)

```diff
@@ -49,6 +49,7 @@ btGetForeignRelSize(PlannerInfo *root,
                     RelOptInfo *baserel,
                     Oid foreigntableid) {
     elog(LOG, "entering function %s", __func__);
+
     baserel->rows = 500;
 }
 
@@ -87,6 +88,8 @@ btGetForeignPlan(PlannerInfo *root,
 
     Index scan_relid = baserel->relid;
 
+    get_limit(root);
+
     scan_clauses = extract_actual_clauses(scan_clauses, false);
 
     return make_foreignscan(tlist,
```

src/bt_fdw.h (+3)

```diff
@@ -97,6 +97,9 @@ bt_fdw_iterate_foreign_scan(bt_fdw_state_t *, ForeignScanState *);
 extern void *
 bt_fdw_exec_foreign_insert(bt_fdw_state_t *, TupleTableSlot *, char *);
 
+extern void *
+get_limit(PlannerInfo *);
+
 /*
  * FDW functions declarations
  */
```
