docs/modules/ROOT/pages/lm-eval-tutorial.adoc (+54 lines)
@@ -56,6 +56,14 @@ There are some configurable global settings for LM-Eval services and they are st
 |`lmes-pod-checking-interval`
 |`10s`
 |The interval to check the job pod for an evaluation job.
+
+|`lmes-allow-online`
+|`true`
+|Whether LMEval jobs are allowed to enable online mode.
+
+|`lmes-code-execution`
+|`true`
+|Whether LMEval jobs are allowed to enable trust remote code mode.
 |===
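For reference, a minimal sketch of how these global flags could be set. The ConfigMap name used below (`trustyai-service-operator-config`) is an assumption based on common TrustyAI deployments and is not confirmed by this diff; verify it against your installation.

[source,yaml]
----
apiVersion: v1
kind: ConfigMap
metadata:
  # Assumed name; check which ConfigMap your TrustyAI operator actually reads.
  name: trustyai-service-operator-config
data:
  lmes-pod-checking-interval: "10s"
  lmes-allow-online: "true"      # allow jobs to request online mode
  lmes-code-execution: "true"    # allow jobs to request trust remote code
----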
@@ -74,6 +82,7 @@ kind: LMEvalJob
 metadata:
   name: evaljob-sample
 spec:
+  allowOnline: true
   model: hf
   modelArgs:
   - name: pretrained
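Pieced together from the hunk above and the sample used later in this page, a complete CR with the new field might look like the following; the model and task values are taken from the tutorial's existing example, not introduced here.

[source,yaml]
----
apiVersion: trustyai.opendatahub.io/v1alpha1
kind: LMEvalJob
metadata:
  name: evaljob-sample
spec:
  allowOnline: true  # permit downloading models, datasets and tokenizers
  model: hf
  modelArgs:
  - name: pretrained
    value: google/flan-t5-base
  taskList:
    taskNames:
    - "qnlieu"
  logSamples: true
----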
@@ -225,6 +234,15 @@ Specify extra information for the lm-eval job's pod.
 |`outputs.pvcName`
 |Binds an existing PVC to a job by specifying its name. The PVC must be created separately and must already exist when creating the job.
+
+|`allowOnline`
+|If set to `true`, the LMEval job will download artifacts as needed (e.g. models, datasets, or tokenizers). If set to `false`, artifacts will not be downloaded and will instead be read from local storage. See `offline`.
+
+|`allowCodeExecution`
+|If set to `true`, the LMEval job will execute the code necessary to prepare models or datasets. If set to `false`, it will not execute downloaded code.
+
+|`offline`
+|Mounts a PVC as local storage for models and datasets.
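The exact shape of the `offline` field is not shown in this diff; as a sketch only, TrustyAI releases typically nest a `storage.pvcName` under it, paired with `allowOnline: false` for fully offline runs. The PVC name below is hypothetical.

[source,yaml]
----
spec:
  allowOnline: false          # do not download anything at runtime
  offline:                    # assumed field shape; not confirmed by this diff
    storage:
      pvcName: "lmeval-data"  # hypothetical pre-created PVC holding models/datasets
----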
 Then, apply this CR into the same namespace as your model. You should see a pod spin up in your
 model namespace called `evaljob`. In the pod terminal, you can see the output via `tail -f output/stderr.log`.
+
+=== Using GPUs
+
+Typically, when using an Inference Service, GPU acceleration is performed at the model server level. However, when using local mode, i.e. running the evaluation locally on the LMEval job, you might want to use available GPUs. To do so, add a resource configuration directly to the job's definition:
+
+[source,yaml]
+----
+apiVersion: trustyai.opendatahub.io/v1alpha1
+kind: LMEvalJob
+metadata:
+  name: evaljob-sample
+spec:
+  model: hf
+  modelArgs:
+  - name: pretrained
+    value: google/flan-t5-base
+  taskList:
+    taskNames:
+    - "qnlieu"
+  logSamples: true
+  allowOnline: true
+  allowCodeExecution: true
+  pod: <1>
+    container:
+      resources:
+        limits: <2>
+          cpu: '1'
+          memory: 8Gi
+          nvidia.com/gpu: '1'
+        requests:
+          cpu: '1'
+          memory: 8Gi
+          nvidia.com/gpu: '1'
+----
+<1> The `pod` section allows adding specific resource definitions to the LMEval job.
+<2> In this case we request `cpu: 1`, `memory: 8Gi` and `nvidia.com/gpu: 1`, but these can be adjusted to your cluster's availability.