# Google Bigtable Rust PostgreSQL FDW

[Rust](https://www.rust-lang.org/en-US/) [PostgreSQL](https://www.postgresql.org/) foreign data wrapper for interfacing with [Google Cloud Bigtable](https://cloud.google.com/bigtable/), as well as other API-compatible databases ([HBase](https://hbase.apache.org/) should work with some effort).

While the logic is contained in `Rust`, it leverages `PostgreSQL`'s `C` FDW callbacks.

### Roadmap

- [x] `select`
- [x] `select limit`
- [ ] `select offset`
- [ ] `select where`
- [x] `insert`
- [ ] `update`
- [ ] `delete`

## Installation

+ `PostgreSQL 9.6+`
+ `Stable Rust 1.15+`, get it using [rustup](https://www.rustup.rs/).

```bash
git clone https://github.com/durch/google-bigtable-postgres-fdw.git
cd google-bigtable-postgres-fdw
make install
psql -U postgres
```

### Initial DB setup

```plpgsql
CREATE EXTENSION bigtable;
CREATE SERVER test FOREIGN DATA WRAPPER bigtable OPTIONS (instance '`instance_id`', project '`project_id`');
CREATE FOREIGN TABLE test(bt json) SERVER test OPTIONS (name '`table_name`');
CREATE USER MAPPING FOR postgres SERVER test OPTIONS (credentials_path '`path_to_service_account_json_credentials`');
```
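
Each Bigtable table gets its own foreign table on the same server, and any role other than `postgres` needs its own user mapping. A minimal sketch, assuming a hypothetical second Bigtable table `events` and a role `analyst`:

```plpgsql
-- hypothetical: a second foreign table on the same server/instance
CREATE FOREIGN TABLE events(bt json) SERVER test OPTIONS (name 'events');
-- hypothetical: a mapping for another role, reusing the same service account credentials
CREATE USER MAPPING FOR analyst SERVER test OPTIONS (credentials_path '`path_to_service_account_json_credentials`');
```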

### Usage

You can use [gen.py]() to generate some test data. Modify `gen.py` to adjust the number of generated records, and also modify the `column` key in the generated output, as this needs to be a `column family` that **exists** in your Bigtable. Running `python gen.py` outputs `test.sql`, which can be fed into PG. `WHERE` is evaluated on the PG side, so be sure to grab what you need from BT.

```bash
psql -U postgres < test.sql
```

#### SELECT

One Bigtable row is returned per PG row; `LIMIT` is pushed down to the BT side. Rows are returned as `json` and can be further manipulated using Postgres `json` [functions and operators](https://www.postgresql.org/docs/9.6/static/functions-json.html).

```sql
SELECT * FROM test;
SELECT * FROM test LIMIT 100;

SELECT bt->'familyName', bt->'qualifier' FROM test WHERE bt->>'rowKey' ~* '.*regex.*';
SELECT bt->'familyName', bt->'qualifier' FROM test WHERE bt->>'rowKey' = 'exact';
```
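
Because each row comes back as a single `json` document, the standard operators compose further; a hedged sketch, assuming the key names from the examples above (`rowKey`, `familyName`, `qualifier`) and a hypothetical `cf1` column family:

```sql
-- extract individual fields as text from each row's json document;
-- the WHERE clause is evaluated on the PG side (see the note above)
SELECT bt->>'rowKey'    AS row_key,
       bt->>'qualifier' AS qualifier
FROM test
WHERE bt->>'familyName' = 'cf1';
```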

#### INSERT

The `INSERT` format is a bit awkward at the moment:

```json
{
    "row_key": string,
    "column": string,
    "column_qualifier": string,
    "data": [
        json
    ]
}
```

Currently `row_key` is treated as a prefix and concatenated with a loop counter; while this covers a few use cases, it is not really ideal for Bigtable, so it will likely be extended to allow passing a `row_key` array. Because you are passing in a single `json` object which gets expanded, the `INSERT` counter always shows one row inserted; the truth can be found in the PG logs.
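
For instance, a hedged sketch of an insert against the `test` foreign table defined above (the values are illustrative, and `column` must name a column family that exists in your Bigtable):

```sql
-- the single json object below expands into three Bigtable rows,
-- keyed by "prefix" concatenated with a loop counter
INSERT INTO test VALUES ('{
    "row_key": "prefix",
    "column": "cf1",
    "column_qualifier": "test",
    "data": ["first cell", "second cell", "third cell"]
}');
```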