-
Notifications
You must be signed in to change notification settings - Fork 47
HTTP-122 Retry for source lookup table #148
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
HTTP-122 Retry for source lookup table #148
Conversation
src/test/java/com/getindata/connectors/http/internal/status/HttpCodesParserTest.java
Outdated
Show resolved
Hide resolved
src/main/java/com/getindata/connectors/http/internal/retry/HttpClientWithRetry.java
Outdated
Show resolved
Hide resolved
Hey @davidradl ! Please take a look at this PR as well. Your feedback would be appreciated. |
@@ -452,33 +508,42 @@ be requested if the current time is later than the cached token expiry time minu | |||
## Table API Connector Options | |||
### HTTP TableLookup Source | |||
|
|||
| Option | Required | Description/Value | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is very difficult to see what has changed. Please could you amend the change so it is minimal especially for this table
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CHANGELOG.md
Outdated
@@ -2,6 +2,10 @@ | |||
|
|||
## [Unreleased] | |||
|
|||
- Added support for auto-retry for source table. Auto retry on IOException and user-defined http codes - parameter `gid.connector.http.source.lookup.retry-codes`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
normally one pr would be one line here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, set as one item with subitems
README.md
Outdated
#### Retry strategy | ||
User can choose retry strategy type for source table: | ||
- fixed-delay - http request will be re-sent after specified delay | ||
- exponential-delay - request will be re-sent with exponential backoff strategy, limited to max-retries attempts. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see that the config option is lookup.max-retries (I suggest using the exact config parameter name) - do we need a separate config for max-retries for sinks?
It would be worth defining exactly what we mean by exponential backoff strategy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, good idea. I added explanation
`setProperty' method from Sink's builder. The property names are: | ||
- `gid.connector.http.sink.error.code` and `gid.connector.http.source.lookup.error.code` used to defined HTTP status code value that should be treated as error for example 404. | ||
`setProperty' method from Sink's builder. The property name are: | ||
- `gid.connector.http.sink.error.code` used to defined HTTP status code value that should be treated as error for example 404. | ||
Many status codes can be defined in one value, where each code should be separated with comma, for example: | ||
`401, 402, 403`. User can use this property also to define a type code mask. In that case, all codes from given HTTP response type will be treated as errors. | ||
An example of such a mask would be `3XX, 4XX, 5XX`. In this case, all 300s, 400s and 500s status codes will be treated as errors. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is X the only mask character?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's part of sink configuration, which wasn't changed. It allows only [1-5]XX or exact http code value. I reimplemented handling of http response code only for source table, which is connected with retry feature.
README.md
Outdated
Many status codes can be defined in one value, where each code should be separated with comma, for example: | ||
`401, 402, 403`. In this example, codes 401, 402 and 403 would not be interpreted as error codes. | ||
|
||
### Source table | ||
Http source requires success codes defined in parameter: `gid.connector.http.source.lookup.success-codes`. That list should contains all http status codes | ||
which are considered as success response. It may be 200 (ok) as well as 404 (not found). The first one is standard response and its content should be deserialized/parsed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: response -> responses
I am not sure what we mean by " It may be 200 (ok) as well as 404 (not found). The first one is standard response and its content should be deserialized/parsed." Is 200 and 404 the defaults or recommended settings?
README.md
Outdated
### Source table | ||
Http source requires success codes defined in parameter: `gid.connector.http.source.lookup.success-codes`. That list should contains all http status codes | ||
which are considered as success response. It may be 200 (ok) as well as 404 (not found). The first one is standard response and its content should be deserialized/parsed. | ||
Processing of 404 request's content may be skipped by adding it to parameter `gid.connector.http.source.lookup.ignored-response-codes`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does skipped mean here - fail the job?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The section was edited. Could you check if it's clear now?
|
||
## TODO | ||
|
||
### HTTP TableLookup Source | ||
- Think about Retry Policy for Http Request |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is this a TODO? I think the TODOs are reminders to developers - rather than a user checklist.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it looks like reminders. I just removed implemented feature from that. Do you want me to remove the whole section TODO?
</dependency> | ||
<!-- Add logging framework, to produce console output when running in the IDE. --> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we have the logging change in a separate PR - it is easier to track the history then please
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why this change is needed anyways?
Its because of resilence4j?
One of the "rule of thumbs" when we were starting this connector was to try not add any external libraries to the connector, that my or may not clash with any user code -> i.e that is why we use Java's 11 http client.
You need the resilence4j for retry functionality right?
Which in essence is -> schedule a task on Java's scheduled thread executor and make sure to do a good job around error/exception handling.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dedicated lib for retries might be an overkill now, but I think we can benefit in long term. The library provides Rate Limiter or Circuit Breaker. Both features might be worth adding. Or at least Rate Limiter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Why this change is needed anyways?
Its because of resilence4j?"
Yes, I had to change to compile project with resilence4j. Notice that Flink use the same API:
https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/advanced/logging/
"You need the resilence4j for retry functionality right?"
I thought it's better to use mature library instead of reimplementing it. From the other side,as you said, it's additional dependency. Below part resilience4j dependencies based on mvn dependency:tree:
[INFO] +- io.github.resilience4j:resilience4j-retry:jar:1.7.1:compile
[INFO] | +- io.vavr:vavr:jar:0.10.2:compile
[INFO] | | \- io.vavr:vavr-match:jar:0.10.2:compile
[INFO] | \- io.github.resilience4j:resilience4j-core:jar:1.7.1:compile
Do you think it's ok to add them? Another option is to shadow them.
private final Retry retry; | ||
|
||
@Builder | ||
HttpClientWithRetry(HttpClient httpClient, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am wondering what happens with OIDC, the short lived bearer token may need to be regenerated if the retries occur after the token has expired). Is this regeneration check done for the retries?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The request is created only once, but OIDC processor (responsible for setting bearer token in request) is called on every retry.
...n/java/com/getindata/connectors/http/internal/table/lookup/HttpLookupTableSourceFactory.java
Outdated
Show resolved
Hide resolved
...n/java/com/getindata/connectors/http/internal/table/lookup/HttpLookupTableSourceFactory.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this PR.
I have few questions though that would be great if you could address them:
Why you need 3rd party library to implement retry? Shouldn't be that hard to use vanilla Java for that.
Btw, do you know how it works under the hood?
Does retry blocks the Flink;s processing or is it asynchronous?
Why you choose to use exceptions to signal that retry is needed? Why not status object or a flag?
3:
There are quite few format changes, why? Could you double check if you have your IDE properly set?
Please start all test methods from "should".
</dependency> | ||
<!-- Add logging framework, to produce console output when running in the IDE. --> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why this change is needed anyways?
Its because of resilence4j?
One of the "rule of thumbs" when we were starting this connector was to try not add any external libraries to the connector, that my or may not clash with any user code -> i.e that is why we use Java's 11 http client.
You need the resilence4j for retry functionality right?
Which in essence is -> schedule a task on Java's scheduled thread executor and make sure to do a good job around error/exception handling.
SOURCE_LOOKUP_HTTP_RETRY_CODES, | ||
SOURCE_LOOKUP_HTTP_IGNORED_RESPONSE_CODES, | ||
|
||
SOURCE_LOOKUP_CONNECTION_TIMEOUT // TODO: add request timeout from properties |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Format change, why?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added blank lines to make it more readable (groups like retry, oidc, cache, core etc.).
...n/java/com/getindata/connectors/http/internal/table/lookup/HttpLookupTableSourceFactory.java
Outdated
Show resolved
Hide resolved
...n/java/com/getindata/connectors/http/internal/table/lookup/HttpLookupTableSourceFactory.java
Outdated
Show resolved
Hide resolved
} catch (RetryHttpRequestException retryException) { | ||
throw retryException.getCausedBy(); | ||
} | ||
} catch (IOException | InterruptedException | HttpStatusCodeValidationFailedException e) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why IOException and InterruptedException are special here?
Why you need then on the send
signature?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are checked exceptions from HttpClient.send
method. I don't want to repack them.
"Incorrect response code: " + response.statusCode(), response); | ||
if (responseChecker.isTemporalError(response)) { | ||
log.debug("Retrying... Received response with code {} for request {}", response.statusCode(), request); | ||
throw new RetryHttpRequestException(validationFailedException); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why we need an exception to communicate a retry is needed?
I would say ok, if that would be an exception from Java's client but here you throwing your own exceptuion.
I'm really not a fan of "communication via exceptions". Exception are very costly... plus this looks not right to me.
Better would be some status object, flag etc.
Similar thing was discussed here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. You're right. It's better to retry based on response. I fixed the code.
src/test/java/com/getindata/connectors/http/internal/retry/HttpClientWithRetryTest.java
Show resolved
Hide resolved
readme refactor HttpClientWithRetry reimplemented (retry based on code, without throwing exception)
src/main/java/com/getindata/connectors/http/internal/retry/HttpClientWithRetry.java
Outdated
Show resolved
Hide resolved
'gid.connector.http.source.lookup.ignored-response-codes' = '404' | ||
) | ||
``` | ||
All 200s codes and 404 are considered as successful (`success-codes`). These responses won't cause retry or job failure. 404 response is also listed in `ignored-response-codes` parameter, what means content body will be ignored. Http with 404 code will produce just empty record. Notice that 404 has to be specified in both `success-codes` and `ignored-response-codes`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we may want to take that burden from the user in the future. 404 seems like a good candidate for a default config
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@MarekMaj @grzegorz8
The 404 status may indicates that destination URL is wrong or resource doesn't exists. Marking 404 as successful response may hide configuration errors. Due to that I didn't set this as default value.
Do you think 404 should be successful by default?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is one way of simplifying the configuration burden. There still should be an option to overwrite that config by the user
Another idea: given that Notice that ignored-response-codes has to be a subset of success-codes.
, maybe defining code (for example 404) as ignored shouldn't require including the same code in success codes?
@kristoffSC @davidradl @MarekMaj |
Description
Retry feature for source lookup table.
PR Checklist