From 71ced9120a9a6c8e27dd14c265875dd391441965 Mon Sep 17 00:00:00 2001
From: Logan Riggs
Date: Mon, 9 Mar 2026 15:36:56 -0700
Subject: [PATCH 1/5] GH-1061 Add codegen classifier jar for arrow-vector.

---
 vector/pom.xml | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/vector/pom.xml b/vector/pom.xml
index b24f37d5f..6a349db97 100644
--- a/vector/pom.xml
+++ b/vector/pom.xml
@@ -194,6 +194,28 @@ under the License.
+      <plugin>
+        <groupId>org.apache.maven.plugins</groupId>
+        <artifactId>maven-jar-plugin</artifactId>
+        <executions>
+          <execution>
+            <id>codegen-jar</id>
+            <goals>
+              <goal>jar</goal>
+            </goals>
+            <phase>package</phase>
+            <configuration>
+              <classifier>codegen</classifier>
+              <classesDirectory>${basedir}/src/main/codegen</classesDirectory>
+              <includes>
+                <include>**/*.tdd</include>
+                <include>**/*.fmpp</include>
+                <include>**/*.ftl</include>
+              </includes>
+            </configuration>
+          </execution>
+        </executions>
+      </plugin>

From 8f7c595d9c4a7c510e81c5c92c1272a2f62ca56f Mon Sep 17 00:00:00 2001
From: Logan Riggs
Date: Mon, 9 Mar 2026 18:19:06 -0700
Subject: [PATCH 2/5] Added comment about the usefullness of the codegen files.

---
 vector/src/main/codegen/config.fmpp | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/vector/src/main/codegen/config.fmpp b/vector/src/main/codegen/config.fmpp
index ef5a5072a..d2473ea96 100644
--- a/vector/src/main/codegen/config.fmpp
+++ b/vector/src/main/codegen/config.fmpp
@@ -13,6 +13,14 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
+# Arrow's type system has many types, and projects building operations across Arrow data
+# (comparisons, casts, aggregations) benefit from generating type-specialized code rather
+# than hand-writing implementations for each type.
+
+# The TDD files provide a machine-readable definition of Arrow's types that enables this.
+# Including them in the distribution allows downstream projects to generate code that
+# stays in sync as Arrow's type system evolves.
+
 data: {
   # TODO: Rename to ~valueVectorModesAndTypes for clarity.
   vv: tdd(../data/ValueVectorTypes.tdd),

From cc77c5ebbd4d18b81025e19eaabb7b4bfd2cac30 Mon Sep 17 00:00:00 2001
From: Logan Riggs
Date: Tue, 10 Mar 2026 10:14:42 -0700
Subject: [PATCH 3/5] Revert "Added comment about the usefullness of the codegen files."

This reverts commit 8f7c595d9c4a7c510e81c5c92c1272a2f62ca56f.
---
 vector/src/main/codegen/config.fmpp | 8 --------
 1 file changed, 8 deletions(-)

diff --git a/vector/src/main/codegen/config.fmpp b/vector/src/main/codegen/config.fmpp
index d2473ea96..ef5a5072a 100644
--- a/vector/src/main/codegen/config.fmpp
+++ b/vector/src/main/codegen/config.fmpp
@@ -13,14 +13,6 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-# Arrow's type system has many types, and projects building operations across Arrow data
-# (comparisons, casts, aggregations) benefit from generating type-specialized code rather
-# than hand-writing implementations for each type.
-
-# The TDD files provide a machine-readable definition of Arrow's types that enables this.
-# Including them in the distribution allows downstream projects to generate code that
-# stays in sync as Arrow's type system evolves.
-
 data: {
   # TODO: Rename to ~valueVectorModesAndTypes for clarity.
   vv: tdd(../data/ValueVectorTypes.tdd),

From 89658c74cdc184e2bdf0673a5343438d1c56bb8d Mon Sep 17 00:00:00 2001
From: Logan Riggs
Date: Tue, 10 Mar 2026 10:15:07 -0700
Subject: [PATCH 4/5] Documentation

---
 docs/source/overview.rst | 3 +++
 vector/pom.xml           | 6 ++++++
 2 files changed, 9 insertions(+)

diff --git a/docs/source/overview.rst b/docs/source/overview.rst
index be579c149..118805411 100644
--- a/docs/source/overview.rst
+++ b/docs/source/overview.rst
@@ -45,6 +45,9 @@ but some modules are JNI bindings to the C++ library.
    * - arrow-vector
      - An off-heap reference implementation for Arrow columnar data format.
      - Native
+   * - arrow-vector-codegen
+     - Template files for Arrow datatypes suitable for code generation.
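+     - Native
    * - arrow-tools
      - Java applications for working with Arrow ValueVectors.
      - Native
diff --git a/vector/pom.xml b/vector/pom.xml
index 6a349db97..6391aafc5 100644
--- a/vector/pom.xml
+++ b/vector/pom.xml
@@ -205,6 +205,12 @@ under the License.
             </goals>
             <phase>package</phase>
             <configuration>
+              <!-- Arrow's type system has many types, and projects building operations across Arrow data
+              (comparisons, casts, aggregations) benefit from generating type-specialized code rather
+              than hand-writing implementations for each type.
+              The TDD files provide a machine-readable definition of Arrow's types that enables this.
+              Including them in the distribution allows downstream projects to generate code that
+              stays in sync as Arrow's type system evolves.-->
               <classifier>codegen</classifier>
               <classesDirectory>${basedir}/src/main/codegen</classesDirectory>
               <includes>

From fd0ec69bd2dc828b6bde6aec3eeab50b4da62e7d Mon Sep 17 00:00:00 2001
From: Logan Riggs
Date: Tue, 10 Mar 2026 10:48:47 -0700
Subject: [PATCH 5/5] Remove a space

---
 vector/pom.xml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/vector/pom.xml b/vector/pom.xml
index 6391aafc5..f46bd0e7b 100644
--- a/vector/pom.xml
+++ b/vector/pom.xml
@@ -209,7 +209,7 @@ under the License.
               (comparisons, casts, aggregations) benefit from generating type-specialized code rather
               than hand-writing implementations for each type.
               The TDD files provide a machine-readable definition of Arrow's types that enables this.
-              Including them in the distribution allows downstream projects to generate code that
+              Including them in the distribution allows downstream projects to generate code that
               stays in sync as Arrow's type system evolves.-->
               <classifier>codegen</classifier>
               <classesDirectory>${basedir}/src/main/codegen</classesDirectory>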