Check your table
Use tryparse_summary.
SmallDatasetMaker.tryparse_summary — Function
tryparse_summary(v::AbstractVector, typetoparse::Type{<:Any})
Example
julia> tryparse_summary(["1", "2", "3.3", 10, "NaN"], Float64) .|> typeof
5-element Vector{DataType}:
SmallDatasetMaker.NotException
SmallDatasetMaker.NotException
SmallDatasetMaker.NotException
MethodError
SmallDatasetMaker.NotException
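The output above suggests that each element is parsed on its own and any failure is recorded rather than raised. A minimal sketch of that idea using only Base follows. It is not the package's implementation: parse_outcomes is a hypothetical helper, and it returns nothing where tryparse_summary reports SmallDatasetMaker.NotException.

```julia
# Hypothetical helper, not part of SmallDatasetMaker.
# Tries to parse every element as T and keeps the exception (if any)
# instead of letting it propagate.
function parse_outcomes(v::AbstractVector, ::Type{T}) where {T}
    map(v) do x
        try
            parse(T, x)
            nothing      # parsed cleanly, nothing to report
        catch err
            err          # keep the exception object for inspection
        end
    end
end

outcomes = parse_outcomes(["1", "2", "3.3", 10, "NaN"], Float64)

# The integer 10 is not a string, so parse(Float64, 10) has no matching method.
typeof.(outcomes)  # Nothing, Nothing, Nothing, MethodError, Nothing
```

Keeping the exception object, rather than only a success flag, makes it possible to group failures by type and message afterwards.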
tryparse_summary(df::AbstractDataFrame, typetoparse)
returns a "long" dataframe with columns :variable_name, :exception_type and :exception_msg.
Example
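using DataFrames
using SmallDatasetMaker: tryparse_summary
df = DataFrame(
    :name => ["John", "Roe", "Mary", "Hello", "World"],
    :salary => [5.372, "1.1", "1", "NaN", "#value"],
    :age => string.([20, 13, 17, 22, 100])
)
# One row per (column, element) pair, describing the parse outcome
summary = tryparse_summary(df, Float64)
# Count how often each outcome occurs in each column
combine(groupby(summary, [:variable_name, :exception_type, :exception_msg]), nrow)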
SmallDatasetMaker.difftables — Function
Given a series of DataFrames, difftables(df0::AbstractDataFrame, dfs::AbstractDataFrame...; ignoring = Cols()) returns report::DataFrame with columns
:nrow: number of rows of each DataFrame.
:ncol: number of columns of each DataFrame.
:cols_lack: columns missing compared to df0.
:cols_add: extra columns compared to df0.
This function is useful for updating an existing dataset, where the column names of the new data might not match those of the existing one.
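The column comparison behind such a report can be sketched with setdiff on the column names. This is only an illustration, not the package's implementation: column_diff is a hypothetical helper, and it assumes that "lacking" means present in df0 but absent from the compared table.

```julia
using DataFrames

# Existing dataset and a new batch whose columns differ slightly
df0 = DataFrame(:id => [1, 2], :name => ["a", "b"], :age => [30, 41])
df1 = DataFrame(:id => [3], :name => ["c"], :city => ["Taipei"])

# Hypothetical helper, not part of SmallDatasetMaker.
# Compares one table against the reference table df0.
function column_diff(df0::AbstractDataFrame, df::AbstractDataFrame)
    return (
        nrow = nrow(df),
        ncol = ncol(df),
        cols_lack = setdiff(names(df0), names(df)),  # in df0 but not in df
        cols_add = setdiff(names(df), names(df0)),   # in df but not in df0
    )
end

report = column_diff(df0, df1)
report.cols_lack  # ["age"]
report.cols_add   # ["city"]
```

Checking both directions before appending new rows shows whether a column was renamed, dropped, or newly introduced.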