Commit 04d4921
Merge pull request #40 from ruivieira/main
feat(lmeval): Add guide for GPU usage in local mode
2 parents 5f37122 + 89714a4 commit 04d4921

File tree

1 file changed: +54 -0 lines changed

docs/modules/ROOT/pages/lm-eval-tutorial.adoc

@@ -56,6 +56,14 @@ There are some configurable global settings for LM-Eval services and they are st
 |`lmes-pod-checking-interval`
 |`10s`
 |The interval to check the job pod for an evaluation job.
+
+|`lmes-allow-online`
+|`true`
+|Whether LMEval jobs are allowed to enable online mode.
+
+|`lmes-code-execution`
+|`true`
+|Whether LMEval jobs are allowed to enable trusting remote code.
 |===
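
To change these defaults cluster-wide, you would edit the settings in the TrustyAI operator's ConfigMap. A minimal sketch, assuming the ConfigMap is named `trustyai-service-operator-config` (verify the actual name and namespace in your installation):

[source,yaml]
----
apiVersion: v1
kind: ConfigMap
metadata:
  name: trustyai-service-operator-config  # assumed name; verify in your installation
data:
  lmes-allow-online: "false"      # forbid jobs from enabling online mode
  lmes-code-execution: "false"    # forbid jobs from trusting remote code
----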

@@ -74,6 +82,7 @@ kind: LMEvalJob
 metadata:
   name: evaljob-sample
 spec:
+  allowOnline: true
   model: hf
   modelArgs:
   - name: pretrained
@@ -225,6 +234,15 @@ Specify extra information for the lm-eval job's pod.

 |`outputs.pvcName`
 |Binds an existing PVC to a job by specifying its name. The PVC must be created separately and must already exist when creating the job.
+
+|`allowOnline`
+|If set to `true`, the LMEval job will download artifacts as needed (e.g. models, datasets, or tokenizers). If set to `false`, artifacts are not downloaded and are loaded from local storage instead. See `offline`.
+
+|`allowCodeExecution`
+|If set to `true`, the LMEval job will execute the code needed to prepare models or datasets. If set to `false`, it will not execute downloaded code.
+
+|`offline`
+|Mounts a PVC as the local storage for models and datasets.
 |===

 == Examples
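
For illustration, a minimal sketch of a fully offline job combining these three parameters; the nested `offline.storage.pvcName` layout below is an assumption based on the field descriptions above, so check the LMEvalJob CRD for the exact schema:

[source,yaml]
----
apiVersion: trustyai.opendatahub.io/v1alpha1
kind: LMEvalJob
metadata:
  name: evaljob-offline    # hypothetical job name
spec:
  model: hf
  modelArgs:
  - name: pretrained
    value: google/flan-t5-base
  taskList:
    taskNames:
    - "qnlieu"
  allowOnline: false       # nothing is downloaded; artifacts come from local storage
  offline:
    storage:
      pvcName: lmeval-data # assumed field layout; the PVC must already exist
----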
@@ -491,6 +509,42 @@ oc get secrets -o custom-columns=SECRET:.metadata.name --no-headers | grep user-
 Then, apply this CR into the same namespace as your model. You should see a pod spin up in your
 model namespace called `evaljob`. In the pod terminal, you can see the output via `tail -f output/stderr.log`.

+=== Using GPUs
+
+Typically, when using an Inference Service, GPU acceleration is performed at the model server level. However, when using local mode, i.e. running the evaluation locally on the LMEval job, you might want to use any available GPUs. To do so, add a resource configuration directly to the job's definition:
+
+[source,yaml]
+----
+apiVersion: trustyai.opendatahub.io/v1alpha1
+kind: LMEvalJob
+metadata:
+  name: evaljob-sample
+spec:
+  model: hf
+  modelArgs:
+  - name: pretrained
+    value: google/flan-t5-base
+  taskList:
+    taskNames:
+    - "qnlieu"
+  logSamples: true
+  allowOnline: true
+  allowCodeExecution: true
+  pod: <1>
+    container:
+      resources:
+        limits: <2>
+          cpu: '1'
+          memory: 8Gi
+          nvidia.com/gpu: '1'
+        requests:
+          cpu: '1'
+          memory: 8Gi
+          nvidia.com/gpu: '1'
+----
+<1> The `pod` section allows adding specific resource definitions to the LMEval job.
+<2> Here we request `cpu: 1`, `memory: 8Gi`, and `nvidia.com/gpu: 1`; adjust these values to your cluster's availability.
+
 === Integration with Kueue

 [NOTE]
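
Once the job pod is running, it can be worth confirming that the GPU request was actually applied. A quick check, assuming the pod is named after the job (`evaljob-sample`); adjust to the actual pod name in your namespace:

[source,shell]
----
# Inspect the resource limits applied to the job pod's container
oc get pod evaljob-sample -o jsonpath='{.spec.containers[0].resources.limits}'
----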
