Add missing array/list functions and aliases #1452

@timsaucer

Summary

Several array/list functions from upstream DataFusion are not yet exposed in datafusion-python. The gaps fall into two groups: functions that have no Python binding at all, and list_* aliases that are missing for array_* functions that are already exposed.

Missing Functions (new)

  • array_any_value / list_any_value — returns any non-null element from the array
  • array_contains / list_contains — alias for array_has
  • array_distance / list_distance — computes distance between two arrays
  • array_max / list_max — returns the maximum element
  • array_min / list_min — returns the minimum element
  • array_reverse / list_reverse — reverses elements in the array
  • arrays_overlap — checks if two arrays share any elements
  • arrays_zip / list_zip — zips multiple arrays into an array of structs
  • generate_series — generates a series of values
  • string_to_array / string_to_list — splits a string into an array by delimiter
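To make the expected behavior concrete, here is a rough pure-Python model of the functions listed above, operating on plain lists rather than DataFusion expressions. It is only a sketch for discussion, not the upstream implementation. Several details are assumptions that should be checked against the upstream docs: that array_distance is Euclidean, that array_max / array_min skip nulls, that arrays_zip pads shorter inputs with nulls (tuples stand in for structs here), and that generate_series includes its upper bound.

```python
import math
from itertools import zip_longest


def array_any_value(arr):
    """Return the first non-null element, or None if all elements are null."""
    return next((x for x in arr if x is not None), None)


def array_distance(a, b):
    """Euclidean distance between two equal-length numeric lists (assumed metric)."""
    if len(a) != len(b):
        raise ValueError("arrays must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def array_max(arr):
    """Largest non-null element, or None if there is none (null handling assumed)."""
    non_null = [x for x in arr if x is not None]
    return max(non_null) if non_null else None


def array_min(arr):
    """Smallest non-null element, or None if there is none (null handling assumed)."""
    non_null = [x for x in arr if x is not None]
    return min(non_null) if non_null else None


def array_reverse(arr):
    """Return a new list with the elements in reverse order."""
    return arr[::-1]


def arrays_overlap(a, b):
    """True if the two lists share at least one element."""
    return any(x in b for x in a)


def arrays_zip(*arrays):
    """Combine elements by index; tuples stand in for structs, None pads short inputs."""
    return list(zip_longest(*arrays, fillvalue=None))


def generate_series(start, stop, step=1):
    """Integers from start to stop, with stop included (assumed inclusive bound)."""
    if step == 0:
        raise ValueError("step must not be zero")
    end = stop + (1 if step > 0 else -1)
    return list(range(start, end, step))


def string_to_array(text, delimiter):
    """Split a string into a list of substrings on the given delimiter."""
    return text.split(delimiter)
```

A model like this could double as a source of expected values when writing tests for the real bindings, once each assumption has been confirmed.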

Missing list_* Aliases

The following list_* aliases exist upstream but are not listed in __all__ in python/datafusion/functions.py:

  • list_empty
  • list_pop_back
  • list_pop_front
  • list_has
  • list_has_all
  • list_has_any
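A gap like this can be checked mechanically. The sketch below compares a list of expected names against a module's `__all__`. The helper `missing_exports` and the stand-in module `fake_functions` are hypothetical names introduced only so the example runs without datafusion installed; the expected names are the six aliases listed above.

```python
import types

# Alias names copied from the list in this issue.
EXPECTED_LIST_ALIASES = [
    "list_empty",
    "list_pop_back",
    "list_pop_front",
    "list_has",
    "list_has_all",
    "list_has_any",
]


def missing_exports(module, expected):
    """Return the expected names absent from module.__all__, in input order."""
    exported = set(getattr(module, "__all__", ()))
    return [name for name in expected if name not in exported]


# Hypothetical stand-in for the real functions module; its contents are made up.
fake_functions = types.ModuleType("fake_functions")
fake_functions.__all__ = ["array_has", "array_empty", "list_has"]

still_missing = missing_exports(fake_functions, EXPECTED_LIST_ALIASES)
print(still_missing)
```

Against a real install, the same helper could presumably be pointed at the module backed by python/datafusion/functions.py instead of the stand-in.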

Upstream Reference

Implementation

  • Rust bindings: crates/core/src/functions.rs
  • Python wrappers: python/datafusion/functions.py

Note: This gap analysis was performed using an AI agent comparing upstream DataFusion v53 documentation against the current datafusion-python codebase.
