This release introduces a new function to summarize project metadata.
The new refine_project_summary() function queries the
running OpenRefine instance and returns high-level summary information
about each project. The summary information is pulled from the
OpenRefine metadata API and includes project identifier, name, date
modified, date created, description, and number of rows in the
project.
Tested with OpenRefine 3.4.1, 3.5.0, and 3.6.2 running on Linux and Mac OSX.
This major release introduces a significant new feature that allows users to perform data cleaning operations in OpenRefine through an API query.
The new functionality passes JSON-specified operations to the running
instance via the /command/core/apply-operations endpoint.
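As a rough illustration of what "JSON-specified operations" can look like on the wire, the sketch below builds a form-encoded request body for the /command/core/apply-operations endpoint using only the Python standard library. It is written in Python so that it runs without an OpenRefine instance, and it is not how rrefine implements the feature. The operation shapes (core/column-removal, core/text-transform) are assumed to follow OpenRefine's operation-history JSON format. The helper names, project id, and column names are hypothetical.

```python
import json
from urllib.parse import urlencode

# Endpoint named in the release notes above.
APPLY_OPERATIONS_PATH = "/command/core/apply-operations"


def remove_column_op(column_name):
    """Hypothetical helper: a 'remove column' operation as a plain dict."""
    return {
        "op": "core/column-removal",
        "columnName": column_name,
        "description": f"Remove column {column_name}",
    }


def to_lower_op(column_name):
    """Hypothetical helper: a text transform that lowercases every cell."""
    return {
        "op": "core/text-transform",
        "engineConfig": {"facets": [], "mode": "row-based"},
        "columnName": column_name,
        "expression": "value.toLowercase()",  # GREL expression (assumed form)
        "onError": "keep-original",
        "repeat": False,
        "repeatCount": 10,
        "description": f"Lowercase cells in column {column_name}",
    }


def build_apply_operations_body(project_id, operations):
    """Form-encode the project id and the JSON array of operations."""
    return urlencode({"project": project_id, "operations": json.dumps(operations)})


# Hypothetical project id and column names, for illustration only.
body = build_apply_operations_body(
    "1234567890123",
    [to_lower_op("species"), remove_column_op("notes")],
)
print(APPLY_OPERATIONS_PATH)
print(body)
```

The operations are sent as a single JSON array, so several cleaning steps can be applied in one request and in a fixed order.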
In addition to the generic refine_operations() function, which can
flexibly accept any valid JSON operation, the rrefine package includes a
series of wrapper functions to perform common data cleaning procedures:
- refine_remove_column(): Remove a column from a project
- refine_add_column(): Add a column to a project
- refine_rename_column(): Rename an existing column in a project
- refine_move_column(): Move a column to a new index
- refine_transform(): Apply arbitrary text transformations
- refine_to_lower(): Coerce text to lowercase
- refine_to_upper(): Coerce text to uppercase
- refine_to_title(): Coerce text to title case
- refine_to_null(): Set values to NULL
- refine_to_empty(): Set text values to empty string ("")
- refine_to_text(): Coerce value to string
- refine_to_number(): Coerce value to numeric
- refine_to_date(): Coerce value to date
- refine_trim_whitespace(): Remove leading and trailing whitespace
- refine_collapse_whitespace(): Collapse consecutive whitespace to a single space
- refine_unescape_html(): Unescape HTML in string

In addition to the data cleaning operations functionality, the documentation has been updated throughout to point to the current OpenRefine user manual (https://docs.openrefine.org/).
Tested with OpenRefine 3.4.1 and 3.5.0 running on Linux and Mac OSX.
Minor release to incorporate new features.
- refine_export: the user can now specify “col_types” for the tabular
  format returned. Thanks to @joelnitta for the contribution!

The only update in this release is the removal of one of the package
dependencies (rlist), which has been scheduled to be archived per the
CRAN team. This change is required for continued distribution of rrefine
via CRAN. Functions from rlist were only used in an unexported rrefine
helper, and there are no anticipated user-facing changes in this
release.
This release includes a number of new features, more robust checks
and internal logic, and many improvements to package documentation. Most
significantly, this version introduces support for the Cross-Site
Request Forgery (CSRF) token in OpenRefine API requests, which is
required in certain API calls as of OpenRefine v3.3. This feature is now
included in rrefine but operates internally and should be
invisible to users. For more information on OpenRefine CSRF protection,
see:
https://github.com/OpenRefine/OpenRefine/wiki/Changes-for-3.3#csrf-protection-changes
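For readers curious about what the token handling involves, here is a minimal sketch of the general pattern: fetch a token from the instance, then attach it to a state-changing request. It assumes the token is served by /command/core/get-csrf-token as a JSON object with a "token" field and is passed back as a csrf_token parameter. The helper names are hypothetical, the response body is simulated rather than fetched, and this is not rrefine's internal code.

```python
import json
from urllib.parse import urlencode

DEFAULT_BASE_URL = "http://127.0.0.1:3333"  # default OpenRefine host and port


def csrf_token_url(base_url=DEFAULT_BASE_URL):
    """URL a client would GET to obtain a fresh token (assumed endpoint)."""
    return f"{base_url}/command/core/get-csrf-token"


def parse_csrf_token(response_text):
    """Extract the token from a response body such as {"token": "abc"}."""
    return json.loads(response_text)["token"]


def with_csrf_token(command_url, token):
    """Append the token to a command URL as a csrf_token query parameter."""
    separator = "&" if "?" in command_url else "?"
    return command_url + separator + urlencode({"csrf_token": token})


# Simulated response body; a live instance would be queried over HTTP instead.
sample_response = '{"token": "example-token"}'
token = parse_csrf_token(sample_response)
url = with_csrf_token(f"{DEFAULT_BASE_URL}/command/core/delete-project", token)
print(csrf_token_url())
print(url)
```

Because a token is requested immediately before each protected call, the user never needs to supply or see it.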
- refine_metadata() function is now exported and user-facing.
- […] (http://127.0.0.1:3333)
- refine_upload() function now checks file format based on extension,
  and allows both .csv and .tsv files to be uploaded.
- refine_upload(..., open.browser=TRUE) will now redirect the user to the
  newly created project in the OpenRefine instance.
- […] tibble with the data in R.
- refine_upload() and refine_delete() functions now confirm success of
  operations by comparing metadata before/after POST requests.
- […] refine_query() internal helper function.
- refine_id() helper function now validates “project.id” against the list
  of project ids in the running instance.
- […] news.md to track release notes!
- refine_delete() and refine_upload() now generate a CSRF token
  internally.

Tested with OpenRefine 3.2, 3.3, and 3.4.1 running on Linux and Mac OSX.
NOTE: The rrefine package was released to CRAN under version 1.0. However, the DESCRIPTION file for that release noted the version as 0.1. All releases from v1.1.0 onwards will maintain consistency between the version in the DESCRIPTION and the version number on the CRAN release.