@@ -7,29 +7,56 @@ At present, the accuracy of the paper cannot be achieved. And i borrowed code fr
**model**
<img src='./demo_image/SRN.png'>

+ **result**
+ | IIIT5k_3000 | SVT    | IC03_860 | IC03_867 | IC13_857 | IC13_1015 | IC15_1811 | IC15_2077 | SVTP   | CUTE80 |
+ | ----------- | ------ | -------- | -------- | -------- | --------- | --------- | --------- | ------ | ------ |
+ | 84.600      | 83.617 | 92.907   | 92.849   | 90.315   | 88.177    | 71.010    | 68.064    | 71.008 | 68.641 |

+ **total_accuracy: 80.597**
+
+ ---

**Feature**
- predicts all characters at once
- DistributedDataParallel training
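
To illustrate the first feature: predicting all characters at once is usually read as parallel (non-autoregressive) decoding, where every output position is classified in one pass instead of step by step. The sketch below is a minimal, dependency-free illustration under that assumption. `CHARSET`, `EOS`, and `decode_parallel` are hypothetical names for this example, not identifiers from this repository.

```python
# Hypothetical character set; index 0 is reserved for an end-of-sequence token.
EOS = "<eos>"
CHARSET = [EOS, "a", "b", "c"]


def decode_parallel(scores):
    """Greedy decoding where every position is classified independently.

    `scores` is a list of per-position score lists (one score per entry in
    CHARSET). No position depends on the prediction of an earlier one, so a
    real model could compute all argmaxes in a single batched operation.
    """
    # Argmax at every position (no feedback between positions).
    indices = [max(range(len(row)), key=row.__getitem__) for row in scores]
    # Keep characters up to the first end-of-sequence token.
    chars = []
    for idx in indices:
        if CHARSET[idx] == EOS:
            break
        chars.append(CHARSET[idx])
    return "".join(chars)


scores = [
    [0.1, 0.2, 0.6, 0.1],  # argmax is "b"
    [0.0, 0.7, 0.2, 0.1],  # argmax is "a"
    [0.9, 0.0, 0.1, 0.0],  # argmax is <eos>, decoding stops here
    [0.1, 0.1, 0.1, 0.7],  # ignored because it comes after <eos>
]
print(decode_parallel(scores))  # prints "ba"
```

An autoregressive decoder would instead feed each predicted character back in before predicting the next one, which makes the positions sequentially dependent.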

+
---
## Requirements
PyTorch >= 1.1.0

## Test
- coming soon ...
+ 1. download the evaluation data from [deep-text-recognition-benchmark](https://github.com/clovaai/deep-text-recognition-benchmark)
+
+ 2. download the pretrained model from [Baidu](https://pan.baidu.com/s/1E5xeajIl_fvtrGWyrE9CeA), Password: d2qn
+
+ 3. test on the evaluation data
+ ```bash
+ python test.py --eval_data path-to-data --saved_model path-to-model
+ ```

---

43
## Train
29
- coming soon ...
44
+ 1 . download the training data from [ deep-text-recognition-benchmark] ( https://github.com/clovaai/deep-text-recognition-benchmark )
45
+
46
+ 2 . training from scratch
47
+ ``` bash
+ python train.py --train_data path-to-train-data --valid_data path-to-valid-data
+ ```

## Reference
1. [bert_ocr.pytorch](https://github.com/chenjun2hao/Bert_OCR.pytorch)
2. [deep-text-recognition-benchmark](https://github.com/clovaai/deep-text-recognition-benchmark)
3. [2D Attentional Irregular Scene Text Recognizer](https://arxiv.org/pdf/1906.05708.pdf)
- 4. [Towards Accurate Scene Text Recognition with Semantic Reasoning Networks](https://arxiv.org/abs/2003.12294)
+ 4. [Towards Accurate Scene Text Recognition with Semantic Reasoning Networks](https://arxiv.org/abs/2003.12294)
+
+ ## Differences from the original paper
+ - uses ResNet to extract 1D features instead of the ResNet-FPN 2D features
+ - uses addition instead of a gated unit in the visual-semantic fusion decoder
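
The fusion difference can be made concrete with a small sketch. It contrasts plain addition with a gated unit of the common form `z = sigmoid(W [v; s])`, `f = z * v + (1 - z) * s`; treating this as the paper's gate is an assumption, and the exact formulation there may differ. Plain Python lists stand in for tensors, and `fuse_add`, `fuse_gated`, and the weight matrix `w` are hypothetical names and values, not code from this repository.

```python
import math


def fuse_add(visual, semantic):
    """Additive fusion: element-wise sum of the two feature vectors."""
    return [v + s for v, s in zip(visual, semantic)]


def fuse_gated(visual, semantic, w):
    """Gated fusion: z = sigmoid(w @ [visual; semantic]); out = z*v + (1-z)*s.

    `w` is a hypothetical learned weight matrix with len(visual) rows and
    2 * len(visual) columns.
    """
    concat = visual + semantic  # list concatenation, i.e. [v; s]
    out = []
    for row, v, s in zip(w, visual, semantic):
        pre = sum(wi * xi for wi, xi in zip(row, concat))
        z = 1.0 / (1.0 + math.exp(-pre))  # gate value in (0, 1)
        out.append(z * v + (1.0 - z) * s)
    return out


visual = [1.0, 2.0]
semantic = [3.0, 0.0]
# Zero weights give a gate of exactly 0.5 for every element.
w = [[0.0, 0.0, 0.0, 0.0],
     [0.0, 0.0, 0.0, 0.0]]

print(fuse_add(visual, semantic))       # [4.0, 2.0]
print(fuse_gated(visual, semantic, w))  # [2.0, 1.0]
```

With zero weights the gate is 0.5, so the gated output is the average of the two inputs. Addition has no learned parameters and always weights both branches equally, while a trained gate can favour the visual or the semantic branch per element.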
+
+ ## Other
+ It is difficult to reach the accuracy reported in the paper; I hope more people will try and share their results.