Check your table

Use tryparse_summary to check whether the values in a vector or a DataFrame can be parsed as a given type.

SmallDatasetMaker.tryparse_summary (Function)

tryparse_summary(v::AbstractVector, typetoparse::Type{<:Any})

Example

julia> tryparse_summary(["1", "2", "3.3", 10, "NaN"], Float64) .|> typeof
5-element Vector{DataType}:
 SmallDatasetMaker.NotException
 SmallDatasetMaker.NotException
 SmallDatasetMaker.NotException
 MethodError
 SmallDatasetMaker.NotException
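The per-element result in the example above can be imitated with a plain `try`/`catch` around `parse`. This is only a minimal sketch of the idea, not the package's implementation: `parse_outcome` is a hypothetical helper, and it returns `nothing` on success where the package reports `SmallDatasetMaker.NotException`.

```julia
# Hypothetical helper (not part of SmallDatasetMaker):
# returns `nothing` if `x` parses as `T`, otherwise the caught exception.
function parse_outcome(x, ::Type{T}) where {T}
    try
        parse(T, x)
        return nothing
    catch err
        return err
    end
end

raw = ["1", "2", "3.3", 10, "NaN", "#value"]
outcomes = [parse_outcome(x, Float64) for x in raw]

# Show what happened for each element.
for (x, o) in zip(raw, outcomes)
    println(repr(x), " => ", o === nothing ? "ok" : typeof(o))
end
```

The integer `10` yields a `MethodError` because `parse` is defined for strings rather than numbers, while a string such as `"#value"` that is not a valid number yields an `ArgumentError`.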

tryparse_summary(df::AbstractDataFrame, typetoparse) returns a "long" dataframe with columns :variable_name, :exception_type and :exception_msg.

Example

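using DataFrames
using SmallDatasetMaker

# :salary mixes a number with strings; :age holds numeric strings only
df = DataFrame(
    :name => ["John", "Roe", "Mary", "Hello", "World"],
    :salary => [5.372, "1.1", "1", "NaN", "#value"],
    :age => string.([20, 13, 17, 22, 100])
)
# Long-format report of the parsing outcomes for every column
report = tryparse_summary(df, Float64)
# Count how often each outcome occurs per column
combine(groupby(report, [:variable_name, :exception_type, :exception_msg]), nrow)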

SmallDatasetMaker.difftables (Function)

Given a series of DataFrames, difftables(df0::AbstractDataFrame, dfs::AbstractDataFrame...; ignoring = Cols()) returns report::DataFrame with the following columns:

  • :nrow: number of rows of each DataFrame.
  • :ncol: number of columns of each DataFrame.
  • :cols_lack: columns of df0 that are missing from the DataFrame.
  • :cols_add: columns of the DataFrame that are not present in df0.

This function is useful when updating an existing dataset, where the new data might have column names that differ from the existing ones.
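To make the meaning of :cols_lack and :cols_add concrete, the comparison can be approximated with `setdiff` on the column names. The following is a rough sketch under stated assumptions, not the package's implementation: `column_diff_sketch` is a hypothetical helper, it ignores the `ignoring` keyword, and whether the real report includes a row for df0 itself is an assumption.

```julia
using DataFrames

# Hypothetical helper (not part of SmallDatasetMaker):
# compares the column names of each DataFrame against those of `df0`.
function column_diff_sketch(df0::AbstractDataFrame, dfs::AbstractDataFrame...)
    base = names(df0)
    report = DataFrame(
        nrow = Int[],
        ncol = Int[],
        cols_lack = Vector{String}[],
        cols_add = Vector{String}[],
    )
    for df in (df0, dfs...)
        push!(report, (
            nrow = nrow(df),
            ncol = ncol(df),
            cols_lack = setdiff(base, names(df)),  # in df0 but not in df
            cols_add = setdiff(names(df), base),   # in df but not in df0
        ))
    end
    return report
end

old = DataFrame(id = [1, 2], name = ["a", "b"])
new = DataFrame(id = [3], Name = ["c"], extra = [1.0])
report = column_diff_sketch(old, new)
```

Here the second row of `report` lists `name` as lacking and `Name` and `extra` as added, which is the kind of mismatch (a renamed column plus a new one) worth catching before appending new data to an existing dataset.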
