Description
I have the following MLIR program:
test.mlir:
module {
  func.func nested @func1() -> f32 {
    %idx0 = index.constant 0
    %idx1 = index.constant 1
    %true = arith.constant true
    %false = arith.constant false
    %alloc_33 = memref.alloc() : memref<11xi1>
    linalg.fill ins(%true : i1) outs(%alloc_33 : memref<11xi1>)
    %alloc_147 = memref.alloc() : memref<11xi1>
    linalg.fill ins(%false : i1) outs(%alloc_147 : memref<11xi1>)
    memref.copy %alloc_147, %alloc_33 : memref<11xi1> to memref<11xi1>
    %dim = memref.dim %alloc_33, %idx0 : memref<11xi1>
    %0 = scf.for %arg1 = %idx0 to %dim step %idx1 iter_args(%arg2 = %false) -> (i1) {
      %1 = memref.load %alloc_33[%arg1] : memref<11xi1>
      vector.print %1 : i1
      %2 = arith.addi %arg2, %1 : i1
      scf.yield %2 : i1
    }
    vector.print %0 : i1
    %1 = arith.sitofp %0 : i1 to f32
    return %1 : f32
  }
}
When I ran
/data/tmp/v1102/llvm-project/build/bin/mlir-opt --convert-vector-to-llvm --convert-linalg-to-loops --convert-scf-to-cf --finalize-memref-to-llvm --convert-arith-to-llvm --convert-func-to-llvm --convert-index-to-llvm --reconcile-unrealized-casts test.mlir | /data/tmp/v1102/llvm-project/build/bin/mlir-cpu-runner -e func1 --shared-libs=/data/tmp/v1102/llvm-project/build/lib/libmlir_runner_utils.so,/data/tmp/v1102/llvm-project/build/lib/libmlir_c_runner_utils.so
on the program, I got the following result:
0
0
0
0
0
0
0
0
0
0
0
0
0.000000e+00
However, when I ran
/data/tmp/v1102/llvm-project/build/bin/mlir-opt --test-linalg-transform-patterns=test-patterns --convert-vector-to-llvm --convert-linalg-to-loops --convert-scf-to-cf --finalize-memref-to-llvm --convert-arith-to-llvm --convert-func-to-llvm --convert-index-to-llvm --reconcile-unrealized-casts test.mlir | /data/tmp/v1102/llvm-project/build/bin/mlir-cpu-runner -e func1 --shared-libs=/data/tmp/v1102/llvm-project/build/lib/libmlir_runner_utils.so,/data/tmp/v1102/llvm-project/build/lib/libmlir_c_runner_utils.so
on the program, I got the following result:
0
0
1
1
1
1
1
1
1
1
1
1
-1.000000e+00
The two results above are inconsistent. I'm not sure whether there is a bug in my program or whether incorrect use of the above passes caused these results.
My git version is 33bdb53.
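As a reading aid for the two outputs: on i1, arith.addi wraps modulo 2 (so it behaves like XOR), and arith.sitofp treats the single bit as a sign bit, so a set bit converts to -1.0 rather than 1.0. The Python sketch below models only that arithmetic. The helper names are hypothetical, and the element lists are copied from the two outputs above rather than computed from the MLIR.

```python
from functools import reduce


def addi_i1(a: int, b: int) -> int:
    """Model arith.addi on i1: addition wraps modulo 2."""
    return (a + b) & 1


def sitofp_i1(bit: int) -> float:
    """Model arith.sitofp on i1: the only bit is the sign bit, so 1 means -1."""
    return -1.0 if bit else 0.0


def run_loop(elements):
    """Fold the loaded elements like the scf.for loop, starting from false."""
    acc = reduce(addi_i1, elements, 0)
    return acc, sitofp_i1(acc)


first_run = [0] * 11            # elements printed by the first command
second_run = [0, 0] + [1] * 9   # elements printed by the second command

print(run_loop(first_run))   # (0, 0.0)
print(run_loop(second_run))  # (1, -1.0)
```

Under this model, the final 1 and -1.000000e+00 follow directly from the nine elements that read back as 1, so the open question is why the copy left those elements set.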
Activity
AnonymousBugreporter1 commented on Nov 29, 2024
I tried to reproduce this issue on earlier commits, and I found that the inconsistent results can be reproduced on commit ebc8153 but not on the previous commit 9c52a19.
To match the older syntax and pass usage, I made some changes to the program. The adjusted test.mlir is:
When I ran
on the program, I got the result of:
However, when I ran
on the program, I got the result of:
Hi @pifon2a, sorry to disturb you, but I was wondering if you would mind taking a look at this problem?
AnonymousBugreporter1 commented on Dec 2, 2024
Hi @nicolasvasilache and @banach-space, sorry to disturb you, but I noticed that you have reviewed the related commit or have worked on the same file, and I was wondering if it might be possible for you to take a look at this problem when you have a moment?
pifon2a commented on Dec 2, 2024
We had a lot of different problems in TensorFlow and XLA, because i1 was actually an 8-bit type instead of 1-bit. Could it be something like that?
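One way this hypothesis could play out, offered only as an unverified sketch: suppose memref<11xi1> stores one byte per element, while a vectorized copy reads and writes an 11-element i1 vector as packed bits, touching only ceil(11/8) = 2 bytes. The Python model below is hypothetical (the function name and the layout assumption are illustrative, not taken from the MLIR or LLVM sources), but under that assumption it reproduces the element pattern printed by the second command.

```python
def packed_vector_copy(src: bytearray, dst: bytearray, n: int) -> None:
    """Copy n i1 elements as if they formed one bit-packed vector (assumption).

    Only ceil(n / 8) bytes are read from src and written to dst, even though
    each buffer holds n elements at one byte apiece.
    """
    nbytes = (n + 7) // 8
    mask = (1 << n) - 1
    bits = int.from_bytes(src[:nbytes], "little") & mask
    dst[:nbytes] = bits.to_bytes(nbytes, "little")


n = 11
src = bytearray([0] * n)  # models %alloc_147, filled with false
dst = bytearray([1] * n)  # models %alloc_33, filled with true

packed_vector_copy(src, dst, n)
print(list(dst))  # [0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1]
```

Only the first two bytes of the destination are overwritten in this model, which matches the two leading zeros followed by nine ones. Whether the actual lowering behaves this way would need to be checked against the generated LLVM IR.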
banach-space commented on Dec 2, 2024
Thanks for reporting this @wangyongj1a !
Just to confirm, is this broken with LLVM ToT (Top of Tree)? And is it this specific flag that breaks things: --test-linalg-transform-patterns=test-patterns? These patterns are defined here:
llvm-project/mlir/test/lib/Dialect/Linalg/TestLinalgTransforms.cpp
Lines 136 to 151 in fe1c4f0
My suspicion would be this bit:
(void)applyPatternsAndFoldGreedily(funcOp, std::move(patterns));
I'd start by reverting these changes in MemRefOps.cpp (CopyOp::fold specifically). Could you try that?
AnonymousBugreporter1 commented on Dec 5, 2024
Thanks for your response!
This problem does require --test-linalg-transform-patterns=test-patterns to reproduce. However, the inconsistency may not come from (void)applyPatternsAndFoldGreedily(funcOp, std::move(patterns));
Consider the condition of CopyOp::fold (i.e., an operand of memref.copy is defined by a cast operation). In this case, both operands are defined by memref.alloc, which does not match the condition.
Further, I ran --test-linalg-transform-patterns=test-patterns on its own, and I got the following output:
Note that vector.transfer_read and vector.transfer_write are generated in CopyVectorizationPattern, which means the inconsistency comes from patterns.add<CopyVectorizationPattern>(ctx);
Besides, this problem can still be reproduced on 7d1c661 using the above commands.
I also compared the LLVM IR produced by the two commands. The LLVM IR that uses
is:
The LLVM IR that uses
is:
The two LLVM IRs above look the same to me.
Does the inconsistency come from the step that converts LLVM IR to assembly?