Binary files /tmp/tmpY_lcbK/EOb3mL_tCX/r-cran-dplyr-0.7.6/build/vignette.rds and /tmp/tmpY_lcbK/jGZXINq8AL/r-cran-dplyr-0.7.8/build/vignette.rds differ
diff -Nru r-cran-dplyr-0.7.6/debian/changelog r-cran-dplyr-0.7.8/debian/changelog
--- r-cran-dplyr-0.7.6/debian/changelog 2018-07-02 09:20:36.000000000 +0000
+++ r-cran-dplyr-0.7.8/debian/changelog 2018-11-18 17:44:24.000000000 +0000
@@ -1,3 +1,24 @@
+r-cran-dplyr (0.7.8-1) unstable; urgency=medium
+
+  * Team upload.
+
+  [ Andreas Tille ]
+  * Drop now unused override since dh-r cares for this
+
+  [ Dylan Aïssi ]
+  * New upstream version (Closes: #912003)
+  * dh-update-R to update Build-Depends
+
+ -- Dylan Aïssi Sun, 18 Nov 2018 18:44:24 +0100
+
+r-cran-dplyr (0.7.7-1) unstable; urgency=medium
+
+  * New upstream version
+  * Standards-Version: 4.2.1
+  * documentation is where it is expected by GNU R users
+
+ -- Andreas Tille Sat, 20 Oct 2018 08:20:22 +0200
+
 r-cran-dplyr (0.7.6-1) unstable; urgency=medium
 
   * New upstream version
diff -Nru r-cran-dplyr-0.7.6/debian/control r-cran-dplyr-0.7.8/debian/control
--- r-cran-dplyr-0.7.6/debian/control 2018-07-02 09:20:36.000000000 +0000
+++ r-cran-dplyr-0.7.8/debian/control 2018-11-18 17:44:24.000000000 +0000
@@ -7,19 +7,19 @@
                dh-r,
                r-base-dev,
                r-cran-assertthat (>= 0.2.0),
-               r-cran-bindrcpp (>= 0.2.0.9000),
+               r-cran-bindrcpp,
                r-cran-glue,
                r-cran-magrittr,
                r-cran-pkgconfig,
                r-cran-r6 (>= 2.2.2),
-               r-cran-rcpp (>= 0.12.15),
-               r-cran-rlang (>= 0.2.0),
-               r-cran-tibble (>= 1.3.1),
+               r-cran-rcpp (>= 0.12.19),
+               r-cran-rlang (>= 0.3.0),
+               r-cran-tibble (>= 1.4.2),
                r-cran-tidyselect (>= 0.2.3),
                r-cran-bh,
                r-cran-plogr (>= 0.1.10),
                libboost-dev (>= 1.58.0)
-Standards-Version: 4.1.4
+Standards-Version: 4.2.1
 Vcs-Browser: https://salsa.debian.org/r-pkg-team/r-cran-dplyr
 Vcs-Git: https://salsa.debian.org/r-pkg-team/r-cran-dplyr.git
 Homepage: https://cran.r-project.org/package=dplyr
diff -Nru r-cran-dplyr-0.7.6/debian/upstream/metadata r-cran-dplyr-0.7.8/debian/upstream/metadata
--- r-cran-dplyr-0.7.6/debian/upstream/metadata 1970-01-01 00:00:00.000000000 +0000
+++ r-cran-dplyr-0.7.8/debian/upstream/metadata 2018-11-18 17:44:24.000000000 +0000
@@ -0,0 +1,7 @@
+Registry:
+ - Name: OMICtools
+   Entry: OMICS_29812
+ - Name: bio.tools
+   Entry: NA
+ - Name: SciCrunch
+   Entry: NA
diff -Nru r-cran-dplyr-0.7.6/DESCRIPTION r-cran-dplyr-0.7.8/DESCRIPTION
--- r-cran-dplyr-0.7.6/DESCRIPTION 2018-06-29 21:23:20.000000000 +0000
+++ r-cran-dplyr-0.7.8/DESCRIPTION 2018-11-10 07:30:02.000000000 +0000
@@ -1,7 +1,7 @@
 Type: Package
 Package: dplyr
 Title: A Grammar of Data Manipulation
-Version: 0.7.6
+Version: 0.7.8
 Authors@R: c(
     person("Hadley", "Wickham", , "hadley@rstudio.com", c("aut", "cre"), comment = c(ORCID = "0000-0003-4757-117X")),
     person("Romain", "Fran\u00e7ois", role = "aut", comment = c(ORCID = "0000-0002-2444-4226")),
@@ -17,8 +17,8 @@
 Depends: R (>= 3.1.2)
 Imports: assertthat (>= 0.2.0), bindrcpp (>= 0.2.0.9000), glue (>=
         1.1.1), magrittr (>= 1.5), methods, pkgconfig (>= 2.0.1), R6
-        (>= 2.2.2), Rcpp (>= 0.12.15), rlang (>= 0.2.0), tibble (>=
-        1.3.1), tidyselect (>= 0.2.3), utils
+        (>= 2.2.2), Rcpp (>= 0.12.19), rlang (>= 0.3.0), tibble (>=
+        1.4.2), tidyselect (>= 0.2.3), utils
 Suggests: bit64 (>= 0.9.7), callr, covr (>= 3.0.1), DBI (>= 0.7.14),
         dbplyr (>= 1.2.0), dtplyr (>= 0.0.2), ggplot2 (>= 2.2.1), hms (>=
         0.4.1), knitr (>= 1.19), Lahman (>= 3.0-1), lubridate,
@@ -31,9 +31,9 @@
 VignetteBuilder: knitr
 Encoding: UTF-8
 LazyData: yes
-RoxygenNote: 6.0.1.9000
+RoxygenNote: 6.1.0
 NeedsCompilation: yes
-Packaged: 2018-06-27 18:50:04 UTC; kirill
+Packaged: 2018-11-09 21:23:40 UTC; kirill
 Author: Hadley Wickham [aut, cre] (),
   Romain François [aut] (),
   Lionel Henry [aut],
@@ -41,4 +41,4 @@
   RStudio [cph, fnd]
 Maintainer: Hadley Wickham
 Repository: CRAN
-Date/Publication: 2018-06-29 21:23:20 UTC
+Date/Publication: 2018-11-10 07:30:02 UTC
diff -Nru r-cran-dplyr-0.7.6/inst/doc/compatibility.html r-cran-dplyr-0.7.8/inst/doc/compatibility.html
--- r-cran-dplyr-0.7.6/inst/doc/compatibility.html 2018-06-27 18:49:51.000000000 +0000
+++ r-cran-dplyr-0.7.8/inst/doc/compatibility.html 2018-11-09 21:23:31.000000000 +0000
@@ -18,46 +18,256 @@
-
+
@@ -80,26 +290,26 @@
  • It’s easier on CRAN since it doesn’t require a massive coordinated release of multiple packages.

  • To make code work with multiple versions of a package, your first tool is the simple if statement:

    -
    if (utils::packageVersion("dplyr") > "0.5.0") {
    -  # code for new version
    -} else {
    -  # code for old version
    -}
    +

    Always condition on > current-version, not >= next-version because this will ensure that this branch is also used for the development version of the package. For example, if the current release is version “0.5.0”, the development version will be “0.5.0.9000”.
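
To see why the `>` comparison is the safer one, compare version objects directly. This is a minimal sketch using base R's package_version(); the version numbers are the illustrative ones from the paragraph above, and 0.6.0 is only assumed to be the next release.

```r
# Illustrative versions: current release, its development build, and an
# assumed next release
released     <- package_version("0.5.0")
development  <- package_version("0.5.0.9000")
next_release <- package_version("0.6.0")

# Conditioning on "> current-version" also selects the development version
dev_gt_current <- development > released

# Conditioning on ">= next-version" would skip the development version
dev_ge_next <- development >= next_release

print(c(dev_gt_current = dev_gt_current, dev_ge_next = dev_ge_next))
```

The first comparison is TRUE and the second is FALSE, so only the `>` form sends development builds down the "new version" branch.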

    Occasionally, you’ll run into a situation where the NAMESPACE has changed and you need to conditionally import different functions. This typically occurs when functions are moved from one package to another. We try our best to provide automatic fallbacks, but this is not always possible. Often you can work around the problem by avoiding importFrom and using :: instead. Do this where possible:

    -
    if (utils::packageVersion("dplyr") > "0.5.0") {
    -  dbplyr::build_sql(...)
    -} else {
    -  dplyr::build_sql(...)
    -}
    +

    This will generate an R CMD check NOTE (because one of the functions will always be missing), but this is ok. Simply explain that you get the note because you have written a wrapper to make sure your code is backward compatible.

    Sometimes it’s not possible to avoid importFrom(). For example you might be importing a generic so that you can define a method for it. In this case, you can take advantage of a little-known feature in the NAMESPACE file: you can include if statements.

    -
    #' @rawNamespace
    -#' if (utils::packageVersion("dplyr") > "0.5.0") {
    -#'   importFrom("dbplyr", "build_sql")
    -#' } else {
    -#'   importFrom("dplyr", "build_sql")
    -#' }
    +

    dplyr 0.6.0

    @@ -107,9 +317,9 @@

    Database code moves to dbplyr

    Almost all database-related code has been moved out of dplyr and into a new package, dbplyr. This makes dplyr simpler, and will make it easier to release fixes for bugs that only affect databases. If you’ve implemented a database backend for dplyr, please read the backend news.

    Depending on which generics you use, and which generics you provide methods for, you may need to write some conditional code. To help make this easier we’ve written wrap_dbplyr_obj() which will write the helper code for you:

    -
    wrap_dbplyr_obj("build_sql")
    -
    -wrap_dbplyr_obj("base_agg")
    +

    Simply copy the results of this function into your package.

    These will generate R CMD check NOTES, so make sure to tell CRAN that this is to ensure backward compatibility.

    @@ -119,102 +329,108 @@

    For users of SE_ verbs

    The legacy underscored versions take objects for which a lazyeval::as.lazy() method is defined. This includes symbols and calls, strings, and formulas. All of these objects have been replaced with quosures and you can call tidyeval verbs with unquoted quosures:

    -
    quo <- quo(cyl)
    -select(mtcars, !! quo)
    +

    Symbolic expressions are also supported, but note that bare symbols and calls do not carry scope information. If you’re referring to objects in the data frame, it’s safe to omit specifying an enclosure:

    -
    sym <- quote(cyl)
    -select(mtcars, !! sym)
    -
    -call <- quote(mean(cyl))
    -summarise(mtcars, !! call)
    +

    Transforming objects into quosures is generally straightforward. To enclose with the current environment, you can unquote directly in quo() or you can use as_quosure():

    -
    quo(!! sym)
    -#> <quosure>
    -#>   expr: ^cyl
    -#>   env:  global
    -quo(!! call)
    -#> <quosure>
    -#>   expr: ^mean(cyl)
    -#>   env:  global
    -
    -rlang::as_quosure(sym)
    -#> <quosure>
    -#>   expr: ^cyl
    -#>   env:  global
    -rlang::as_quosure(call)
    -#> <quosure>
    -#>   expr: ^mean(cyl)
    -#>   env:  global
    +

    Note that while formulas and quosures are very similar objects (and in the most general sense, formulas are quosures), they can’t be used interchangeably in tidyeval functions. Early implementations did treat bare formulas as quosures, but this created compatibility issues with modelling functions of the stats package. Fortunately, it’s easy to transform formulas to quosures that will self-evaluate in tidyeval functions:

    -
    f <- ~cyl
    -f
    -#> ~cyl
    -rlang::as_quosure(f)
    -#> <quosure>
    -#>   expr: ^cyl
    -#>   env:  global
    +

    Finally, and perhaps most importantly, strings are not and should not be parsed. As developers, it is tempting to try and solve problems using strings because we have been trained to work with strings rather than quoted expressions. However it’s almost always the wrong way to approach the problem. The exception is for creating symbols. In that case it is perfectly legitimate to use strings:

    -
    rlang::sym("cyl")
    -#> cyl
    -rlang::syms(letters[1:3])
    -#> [[1]]
    -#> a
    -#> 
    -#> [[2]]
    -#> b
    -#> 
    -#> [[3]]
    -#> c
    +

    But you should never use strings to create calls. Instead you can use quasiquotation:

    -
    syms <- rlang::syms(c("foo", "bar", "baz"))
    -quo(my_call(!!! syms))
    -#> <quosure>
    -#>   expr: ^my_call(foo, bar, baz)
    -#>   env:  global
    -
    -fun <- rlang::sym("my_call")
    -quo((!!fun)(!!! syms))
    -#> <quosure>
    -#>   expr: ^my_call(foo, bar, baz)
    -#>   env:  global
    +

    Or create the call with lang():

    -
    call <- rlang::lang("my_call", !!! syms)
    -call
    -#> my_call(foo, bar, baz)
    -
    -rlang::as_quosure(call)
    -#> <quosure>
    -#>   expr: ^my_call(foo, bar, baz)
    -#>   env:  global
    -
    -# Or equivalently:
    -quo(!! rlang::lang("my_call", !!! syms))
    -#> <quosure>
    -#>   expr: ^my_call(foo, bar, baz)
    -#>   env:  global
    +

    Note that idioms based on interp() should now generally be avoided and replaced with quasiquotation. Where you used to interpolate:

    -
    lazyeval::interp(~ mean(var), var = rlang::sym("mpg"))
    +

    You would now unquote:

    -
    var <- "mpg"
    -quo(mean(!! rlang::sym(var)))
    +

    See also vignette("programming") for more about quasiquotation and quosures.

    For package authors

    For package authors, rlang provides a compatibility file that you can copy to your package. compat_lazy() and compat_lazy_dots() turn lazy-able objects into proper quosures. This helps you provide an underscored version to your users for backward compatibility. For instance, here is how we defined the underscored version of filter() in dplyr 0.6:

    -
    filter_.tbl_df <- function(.data, ..., .dots = list()) {
    -  dots <- compat_lazy_dots(.dots, caller_env(), ...)
    -  filter(.data, !!! dots)
    -}
    +

    With tidyeval, S3 dispatch to the correct method might be an issue. In the past, the genericity of dplyr verbs was accomplished by dispatching in the underscored versions. Now that those are deprecated, we’ve turned the non-underscored verbs into S3 generics.

    We maintain backward compatibility by redispatching to old underscored verbs in the default methods of the new S3 generics. For example, here is how we redispatch filter():

    -
    filter.default <- function(.data, ...) {
    -  filter_(.data, .dots = compat_as_lazy_dots(...))
    -}
    +

    This gets the job done in most cases. However, the default method will not be called for objects inheriting from one of the classes for which we provide non-underscored methods: data.frame, tbl_df, tbl_cube and grouped_df. An example of this is the sf package whose objects have classes c("sf", "data.frame"). Authors of such packages should provide a method for the non-underscored generic in order to be compatible with dplyr:

    -
    filter.sf <- function(.data, ...) {
    -  st_as_sf(NextMethod())
    -}
    +
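
The dispatch behaviour described above can be reproduced with a toy generic in base R. This is only a sketch: describe() and the class "myclass" are hypothetical names used for illustration, not part of dplyr or sf.

```r
# Hypothetical generic with a default method and a data.frame method,
# mirroring a verb that ships methods for some classes
describe <- function(x, ...) UseMethod("describe")
describe.default <- function(x, ...) "default"
describe.data.frame <- function(x, ...) "data.frame"

# An object whose class vector ends in "data.frame", like c("sf", "data.frame")
obj <- structure(data.frame(a = 1), class = c("myclass", "data.frame"))

# The data.frame method is found first, so the default method is never reached
before <- describe(obj)

# A method for the subclass takes over; NextMethod() delegates to the
# data.frame method and the result can then be post-processed
describe.myclass <- function(x, ...) {
  inherited <- NextMethod()
  paste("myclass via", inherited)
}
after <- describe(obj)

print(c(before = before, after = after))
```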

    If you need help with this, please let us know!

    @@ -222,17 +438,17 @@

    Deprecation of mutate_each() and summarise_each()

    These functions have been replaced by a more complete family of functions. This family has suffixes _if, _at and _all and includes more verbs than just mutate and summarise.

    If you need to update your code to the new family, there are two relevant functions depending on which variables you apply funs() to. If you called mutate_each() without supplying a selection of variables, funs is applied to all variables. In this case, you should update your code to use mutate_all() instead:

    -
    mutate_each(starwars, funs(as.character))
    -mutate_all(starwars, funs(as.character))
    +

    Note that the new verbs support bare functions as well, so you don’t necessarily need to wrap with funs():

    -
    mutate_all(starwars, as.character)
    +

    On the other hand, if you supplied a variable selection, you should use mutate_at(). The variable selection should be wrapped with vars().

    -
    mutate_each(starwars, funs(as.character), height, mass)
    -mutate_at(starwars, vars(height, mass), as.character)
    +

    vars() supports all the selection helpers that you usually use with select():

    -
    summarise_at(mtcars, vars(starts_with("d")), mean)
    +

    Note that instead of a vars() selection, you can also supply character vectors of column names:

    -
    mutate_at(starwars, c("height", "mass"), as.character)
    +
diff -Nru r-cran-dplyr-0.7.6/inst/doc/dplyr.html r-cran-dplyr-0.7.8/inst/doc/dplyr.html
--- r-cran-dplyr-0.7.6/inst/doc/dplyr.html 2018-06-27 18:49:58.000000000 +0000
+++ r-cran-dplyr-0.7.8/inst/doc/dplyr.html 2018-11-09 21:23:36.000000000 +0000
@@ -18,46 +18,256 @@
-
+
@@ -86,21 +296,21 @@

    Data: nycflights13

    To explore the basic data manipulation verbs of dplyr, we’ll use nycflights13::flights. This dataset contains all 336776 flights that departed from New York City in 2013. The data comes from the US Bureau of Transportation Statistics, and is documented in ?nycflights13.

    -
    library(nycflights13)
    -dim(flights)
    -#> [1] 336776     19
    -flights
    -#> # A tibble: 336,776 x 19
    -#>    year month   day dep_time sched_dep_time dep_delay arr_time
    -#>   <int> <int> <int>    <int>          <int>     <dbl>    <int>
    -#> 1  2013     1     1      517            515         2      830
    -#> 2  2013     1     1      533            529         4      850
    -#> 3  2013     1     1      542            540         2      923
    -#> 4  2013     1     1      544            545        -1     1004
    -#> # ... with 336,772 more rows, and 12 more variables: sched_arr_time <int>,
    -#> #   arr_delay <dbl>, carrier <chr>, flight <int>, tailnum <chr>,
    -#> #   origin <chr>, dest <chr>, air_time <dbl>, distance <dbl>, hour <dbl>,
    -#> #   minute <dbl>, time_hour <dttm>
    +

    Note that nycflights13::flights is a tibble, a modern reimagining of the data frame. It’s particularly useful for large datasets because it only prints the first few rows. You can learn more about tibbles at http://tibble.tidyverse.org; in particular you can convert data frames to tibbles with as_tibble().

    @@ -118,197 +328,197 @@

    Filter rows with filter()

    filter() allows you to select a subset of rows in a data frame. Like all single verbs, the first argument is the tibble (or data frame). The second and subsequent arguments refer to variables within that data frame, selecting rows where the expression is TRUE.

    For example, we can select all flights on January 1st with:

    -
    filter(flights, month == 1, day == 1)
    -#> # A tibble: 842 x 19
    -#>    year month   day dep_time sched_dep_time dep_delay arr_time
    -#>   <int> <int> <int>    <int>          <int>     <dbl>    <int>
    -#> 1  2013     1     1      517            515         2      830
    -#> 2  2013     1     1      533            529         4      850
    -#> 3  2013     1     1      542            540         2      923
    -#> 4  2013     1     1      544            545        -1     1004
    -#> # ... with 838 more rows, and 12 more variables: sched_arr_time <int>,
    -#> #   arr_delay <dbl>, carrier <chr>, flight <int>, tailnum <chr>,
    -#> #   origin <chr>, dest <chr>, air_time <dbl>, distance <dbl>, hour <dbl>,
    -#> #   minute <dbl>, time_hour <dttm>
    +

    This is roughly equivalent to this base R code:

    -
    flights[flights$month == 1 & flights$day == 1, ]
    +

    Arrange rows with arrange()

    arrange() works similarly to filter() except that instead of filtering or selecting rows, it reorders them. It takes a data frame, and a set of column names (or more complicated expressions) to order by. If you provide more than one column name, each additional column will be used to break ties in the values of preceding columns:

    -
    arrange(flights, year, month, day)
    -#> # A tibble: 336,776 x 19
    -#>    year month   day dep_time sched_dep_time dep_delay arr_time
    -#>   <int> <int> <int>    <int>          <int>     <dbl>    <int>
    -#> 1  2013     1     1      517            515         2      830
    -#> 2  2013     1     1      533            529         4      850
    -#> 3  2013     1     1      542            540         2      923
    -#> 4  2013     1     1      544            545        -1     1004
    -#> # ... with 336,772 more rows, and 12 more variables: sched_arr_time <int>,
    -#> #   arr_delay <dbl>, carrier <chr>, flight <int>, tailnum <chr>,
    -#> #   origin <chr>, dest <chr>, air_time <dbl>, distance <dbl>, hour <dbl>,
    -#> #   minute <dbl>, time_hour <dttm>
    +

    Use desc() to order a column in descending order:

    -
    arrange(flights, desc(arr_delay))
    -#> # A tibble: 336,776 x 19
    -#>    year month   day dep_time sched_dep_time dep_delay arr_time
    -#>   <int> <int> <int>    <int>          <int>     <dbl>    <int>
    -#> 1  2013     1     9      641            900      1301     1242
    -#> 2  2013     6    15     1432           1935      1137     1607
    -#> 3  2013     1    10     1121           1635      1126     1239
    -#> 4  2013     9    20     1139           1845      1014     1457
    -#> # ... with 336,772 more rows, and 12 more variables: sched_arr_time <int>,
    -#> #   arr_delay <dbl>, carrier <chr>, flight <int>, tailnum <chr>,
    -#> #   origin <chr>, dest <chr>, air_time <dbl>, distance <dbl>, hour <dbl>,
    -#> #   minute <dbl>, time_hour <dttm>
    +

    Select columns with select()

    Often you work with large datasets with many columns but only a few are actually of interest to you. select() allows you to rapidly zoom in on a useful subset using operations that usually only work on numeric variable positions:

    -
    # Select columns by name
    -select(flights, year, month, day)
    -#> # A tibble: 336,776 x 3
    -#>    year month   day
    -#>   <int> <int> <int>
    -#> 1  2013     1     1
    -#> 2  2013     1     1
    -#> 3  2013     1     1
    -#> 4  2013     1     1
    -#> # ... with 336,772 more rows
    -# Select all columns between year and day (inclusive)
    -select(flights, year:day)
    -#> # A tibble: 336,776 x 3
    -#>    year month   day
    -#>   <int> <int> <int>
    -#> 1  2013     1     1
    -#> 2  2013     1     1
    -#> 3  2013     1     1
    -#> 4  2013     1     1
    -#> # ... with 336,772 more rows
    -# Select all columns except those from year to day (inclusive)
    -select(flights, -(year:day))
    -#> # A tibble: 336,776 x 16
    -#>   dep_time sched_dep_time dep_delay arr_time sched_arr_time arr_delay
    -#>      <int>          <int>     <dbl>    <int>          <int>     <dbl>
    -#> 1      517            515         2      830            819        11
    -#> 2      533            529         4      850            830        20
    -#> 3      542            540         2      923            850        33
    -#> 4      544            545        -1     1004           1022       -18
    -#> # ... with 336,772 more rows, and 10 more variables: carrier <chr>,
    -#> #   flight <int>, tailnum <chr>, origin <chr>, dest <chr>, air_time <dbl>,
    -#> #   distance <dbl>, hour <dbl>, minute <dbl>, time_hour <dttm>
    +

    There are a number of helper functions you can use within select(), like starts_with(), ends_with(), matches() and contains(). These let you quickly match larger blocks of variables that meet some criterion. See ?select for more details.

    You can rename variables with select() by using named arguments:

    -
    select(flights, tail_num = tailnum)
    -#> # A tibble: 336,776 x 1
    -#>   tail_num
    -#>   <chr>   
    -#> 1 N14228  
    -#> 2 N24211  
    -#> 3 N619AA  
    -#> 4 N804JB  
    -#> # ... with 336,772 more rows
    +

    But because select() drops all the variables not explicitly mentioned, it’s not that useful. Instead, use rename():

    -
    rename(flights, tail_num = tailnum)
    -#> # A tibble: 336,776 x 19
    -#>    year month   day dep_time sched_dep_time dep_delay arr_time
    -#>   <int> <int> <int>    <int>          <int>     <dbl>    <int>
    -#> 1  2013     1     1      517            515         2      830
    -#> 2  2013     1     1      533            529         4      850
    -#> 3  2013     1     1      542            540         2      923
    -#> 4  2013     1     1      544            545        -1     1004
    -#> # ... with 336,772 more rows, and 12 more variables: sched_arr_time <int>,
    -#> #   arr_delay <dbl>, carrier <chr>, flight <int>, tail_num <chr>,
    -#> #   origin <chr>, dest <chr>, air_time <dbl>, distance <dbl>, hour <dbl>,
    -#> #   minute <dbl>, time_hour <dttm>
    +

    Add new columns with mutate()

    Besides selecting sets of existing columns, it’s often useful to add new columns that are functions of existing columns. This is the job of mutate():

    -
    mutate(flights,
    -  gain = arr_delay - dep_delay,
    -  speed = distance / air_time * 60
    -)
    -#> # A tibble: 336,776 x 21
    -#>    year month   day dep_time sched_dep_time dep_delay arr_time
    -#>   <int> <int> <int>    <int>          <int>     <dbl>    <int>
    -#> 1  2013     1     1      517            515         2      830
    -#> 2  2013     1     1      533            529         4      850
    -#> 3  2013     1     1      542            540         2      923
    -#> 4  2013     1     1      544            545        -1     1004
    -#> # ... with 336,772 more rows, and 14 more variables: sched_arr_time <int>,
    -#> #   arr_delay <dbl>, carrier <chr>, flight <int>, tailnum <chr>,
    -#> #   origin <chr>, dest <chr>, air_time <dbl>, distance <dbl>, hour <dbl>,
    -#> #   minute <dbl>, time_hour <dttm>, gain <dbl>, speed <dbl>
    +

    dplyr::mutate() is similar to the base transform(), but allows you to refer to columns that you’ve just created:

    -
    mutate(flights,
    -  gain = arr_delay - dep_delay,
    -  gain_per_hour = gain / (air_time / 60)
    -)
    -#> # A tibble: 336,776 x 21
    -#>    year month   day dep_time sched_dep_time dep_delay arr_time
    -#>   <int> <int> <int>    <int>          <int>     <dbl>    <int>
    -#> 1  2013     1     1      517            515         2      830
    -#> 2  2013     1     1      533            529         4      850
    -#> 3  2013     1     1      542            540         2      923
    -#> 4  2013     1     1      544            545        -1     1004
    -#> # ... with 336,772 more rows, and 14 more variables: sched_arr_time <int>,
    -#> #   arr_delay <dbl>, carrier <chr>, flight <int>, tailnum <chr>,
    -#> #   origin <chr>, dest <chr>, air_time <dbl>, distance <dbl>, hour <dbl>,
    -#> #   minute <dbl>, time_hour <dttm>, gain <dbl>, gain_per_hour <dbl>
    +

    If you only want to keep the new variables, use transmute():

    -
    transmute(flights,
    -  gain = arr_delay - dep_delay,
    -  gain_per_hour = gain / (air_time / 60)
    -)
    -#> # A tibble: 336,776 x 2
    -#>    gain gain_per_hour
    -#>   <dbl>         <dbl>
    -#> 1     9          2.38
    -#> 2    16          4.23
    -#> 3    31         11.6 
    -#> 4   -17         -5.57
    -#> # ... with 336,772 more rows
    +

    Summarise values with summarise()

    The last verb is summarise(). It collapses a data frame to a single row.

    -
    summarise(flights,
    -  delay = mean(dep_delay, na.rm = TRUE)
    -)
    -#> # A tibble: 1 x 1
    -#>   delay
    -#>   <dbl>
    -#> 1  12.6
    +

    It’s not that useful until we learn the group_by() verb below.

    Randomly sample rows with sample_n() and sample_frac()

    You can use sample_n() and sample_frac() to take a random sample of rows: use sample_n() for a fixed number and sample_frac() for a fixed fraction.

    -
    sample_n(flights, 10)
    -#> # A tibble: 10 x 19
    -#>    year month   day dep_time sched_dep_time dep_delay arr_time
    -#>   <int> <int> <int>    <int>          <int>     <dbl>    <int>
    -#> 1  2013    10     1      822            825        -3      932
    -#> 2  2013     8     2      712            715        -3     1015
    -#> 3  2013     5    10     1309           1315        -6     1502
    -#> 4  2013    10    28     2002           1930        32     2318
    -#> # ... with 6 more rows, and 12 more variables: sched_arr_time <int>,
    -#> #   arr_delay <dbl>, carrier <chr>, flight <int>, tailnum <chr>,
    -#> #   origin <chr>, dest <chr>, air_time <dbl>, distance <dbl>, hour <dbl>,
    -#> #   minute <dbl>, time_hour <dttm>
    -sample_frac(flights, 0.01)
    -#> # A tibble: 3,368 x 19
    -#>    year month   day dep_time sched_dep_time dep_delay arr_time
    -#>   <int> <int> <int>    <int>          <int>     <dbl>    <int>
    -#> 1  2013     8    16      827            830        -3      928
    -#> 2  2013    11     4     1306           1300         6     1639
    -#> 3  2013     1    14      929            935        -6     1213
    -#> 4  2013    12    28      625            630        -5      916
    -#> # ... with 3,364 more rows, and 12 more variables: sched_arr_time <int>,
    -#> #   arr_delay <dbl>, carrier <chr>, flight <int>, tailnum <chr>,
    -#> #   origin <chr>, dest <chr>, air_time <dbl>, distance <dbl>, hour <dbl>,
    -#> #   minute <dbl>, time_hour <dttm>
    +

    Use replace = TRUE to perform a bootstrap sample. If needed, you can weight the sample with the weight argument.
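
A minimal sketch of both options, assuming dplyr is installed and using the built-in mtcars data rather than flights to keep it small. Weighting by the wt column is an arbitrary illustrative choice.

```r
library(dplyr)

set.seed(42)  # make the random draws reproducible

# Bootstrap sample: as many rows as the original, drawn with replacement
boot <- sample_n(mtcars, nrow(mtcars), replace = TRUE)

# Weighted sample: half of the rows, heavier cars are more likely to be drawn
heavy <- sample_frac(mtcars, 0.5, weight = wt)

print(dim(boot))
print(dim(heavy))
```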

    @@ -338,20 +548,20 @@
  • summarise() computes the summary for each group.

  • In the following example, we split the complete dataset into individual planes and then summarise each plane by counting the number of flights (count = n()) and computing the average distance (dist = mean(distance, na.rm = TRUE)) and arrival delay (delay = mean(arr_delay, na.rm = TRUE)). We then use ggplot2 to display the output.

    -
    by_tailnum <- group_by(flights, tailnum)
    -delay <- summarise(by_tailnum,
    -  count = n(),
    -  dist = mean(distance, na.rm = TRUE),
    -  delay = mean(arr_delay, na.rm = TRUE))
    -delay <- filter(delay, count > 20, dist < 2000)
    -
    -# Interestingly, the average delay is only slightly related to the
    -# average distance flown by a plane.
    -ggplot(delay, aes(dist, delay)) +
    -  geom_point(aes(size = count), alpha = 1/2) +
    -  geom_smooth() +
    -  scale_size_area()
    -

    + +

    You use summarise() with aggregate functions, which take a vector of values and return a single number. There are many useful examples of such functions in base R like min(), max(), mean(), sum(), sd(), median(), and IQR(). dplyr provides a handful of others:

    • n(): the number of observations in the current group

    • @@ -359,257 +569,257 @@
    • first(x), last(x) and nth(x, n) - these work similarly to x[1], x[length(x)], and x[n] but give you more control over the result if the value is missing.
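
    As a small sketch of the extra control these helpers give (assuming dplyr is installed; the vector and the default value are made up for illustration), compare plain indexing with nth() when the requested position does not exist:

```r
library(dplyr)

x <- c(10, 20, 30)

first_value <- first(x)  # same as x[1]
last_value  <- last(x)   # same as x[length(x)]

# Plain indexing past the end silently gives NA
plain <- x[5]

# nth() lets you choose what to return when the position is missing
with_default <- nth(x, 5, default = -1)

print(c(first_value, last_value, plain, with_default))
```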

    For example, we could use these to find the number of planes and the number of flights that go to each possible destination:

    -
    destinations <- group_by(flights, dest)
    -summarise(destinations,
    -  planes = n_distinct(tailnum),
    -  flights = n()
    -)
    -#> # A tibble: 105 x 3
    -#>   dest  planes flights
    -#>   <chr>  <int>   <int>
    -#> 1 ABQ      108     254
    -#> 2 ACK       58     265
    -#> 3 ALB      172     439
    -#> 4 ANC        6       8
    -#> # ... with 101 more rows
    +

    When you group by multiple variables, each summary peels off one level of the grouping. That makes it easy to progressively roll-up a dataset:

    -
    daily <- group_by(flights, year, month, day)
    -(per_day   <- summarise(daily, flights = n()))
    -#> # A tibble: 365 x 4
    -#> # Groups:   year, month [?]
    -#>    year month   day flights
    -#>   <int> <int> <int>   <int>
    -#> 1  2013     1     1     842
    -#> 2  2013     1     2     943
    -#> 3  2013     1     3     914
    -#> 4  2013     1     4     915
    -#> # ... with 361 more rows
    -(per_month <- summarise(per_day, flights = sum(flights)))
    -#> # A tibble: 12 x 3
    -#> # Groups:   year [?]
    -#>    year month flights
    -#>   <int> <int>   <int>
    -#> 1  2013     1   27004
    -#> 2  2013     2   24951
    -#> 3  2013     3   28834
    -#> 4  2013     4   28330
    -#> # ... with 8 more rows
    -(per_year  <- summarise(per_month, flights = sum(flights)))
    -#> # A tibble: 1 x 2
    -#>    year flights
    -#>   <int>   <int>
    -#> 1  2013  336776
    +

    However you need to be careful when progressively rolling up summaries like this: it’s ok for sums and counts, but you need to think about weighting for means and variances (it’s not possible to do this exactly for medians).
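
The need for weighting can be seen with a tiny base R example. The two groups and their values are invented for illustration; the point is only that a mean of group means differs from the overall mean unless each group mean is weighted by its group size.

```r
# Two made-up groups of unequal size
groups <- list(a = c(1, 2, 3, 4), b = c(10))

group_means <- vapply(groups, mean, numeric(1))  # per-group means
group_sizes <- lengths(groups)                   # per-group counts

overall  <- mean(unlist(groups))                     # mean of all raw values
naive    <- mean(group_means)                        # unweighted mean of means
weighted <- weighted.mean(group_means, group_sizes)  # weighted by group size

print(c(overall = overall, naive = naive, weighted = weighted))
```

The weighted roll-up matches the overall mean, while the naive one does not.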

    Selecting operations

    One of the appealing features of dplyr is that you can refer to columns from the tibble as if they were regular variables. However, the syntactic uniformity of referring to bare column names hides semantic differences across the verbs. A column symbol supplied to select() does not have the same meaning as the same symbol supplied to mutate().

    Selecting operations expect column names and positions. Hence, when you call select() with bare variable names, they actually represent their own positions in the tibble. The following calls are completely equivalent from dplyr’s point of view:

    -
    # `year` represents the integer 1
    -select(flights, year)
    -#> # A tibble: 336,776 x 1
    -#>    year
    -#>   <int>
    -#> 1  2013
    -#> 2  2013
    -#> 3  2013
    -#> 4  2013
    -#> # ... with 336,772 more rows
    -select(flights, 1)
    -#> # A tibble: 336,776 x 1
    -#>    year
    -#>   <int>
    -#> 1  2013
    -#> 2  2013
    -#> 3  2013
    -#> 4  2013
    -#> # ... with 336,772 more rows
    +

    By the same token, this means that you cannot refer to variables from the surrounding context if they have the same name as one of the columns. In the following example, year still represents 1, not 5:

    -
    year <- 5
    -select(flights, year)
    +

    One useful subtlety is that this only applies to bare names and to selecting calls like c(year, month, day) or year:day. In all other cases, the columns of the data frame are not put in scope. This allows you to refer to contextual variables in selection helpers:

    -
    year <- "dep"
    -select(flights, starts_with(year))
    -#> # A tibble: 336,776 x 2
    -#>   dep_time dep_delay
    -#>      <int>     <dbl>
    -#> 1      517         2
    -#> 2      533         4
    -#> 3      542         2
    -#> 4      544        -1
    -#> # ... with 336,772 more rows
    +

    These semantics are usually intuitive. But note the subtle difference:

    -
    year <- 5
    -select(flights, year, identity(year))
    -#> # A tibble: 336,776 x 2
    -#>    year sched_dep_time
    -#>   <int>          <int>
    -#> 1  2013            515
    -#> 2  2013            529
    -#> 3  2013            540
    -#> 4  2013            545
    -#> # ... with 336,772 more rows
    +

    In the first argument, year represents its own position 1. In the second argument, year is evaluated in the surrounding context and represents the fifth column.

    For a long time, select() only understood column positions. Starting with dplyr 0.6, it also understands column names. This makes it a bit easier to program with select():

    -
    vars <- c("year", "month")
    -select(flights, vars, "day")
    -#> # A tibble: 336,776 x 3
    -#>    year month   day
    -#>   <int> <int> <int>
    -#> 1  2013     1     1
    -#> 2  2013     1     1
    -#> 3  2013     1     1
    -#> 4  2013     1     1
    -#> # ... with 336,772 more rows
    +

    Note that the code above is somewhat unsafe because you might have added a column named vars to the tibble, or you might apply the code to another data frame containing such a column. To avoid this issue, you can wrap the variable in an identity() call as we mentioned above, as this will bypass column names. However, a more explicit and general method that works in all dplyr verbs is to unquote the variable with the !! operator. This tells dplyr to bypass the data frame and to directly look in the context:

    -
    # Let's create a new `vars` column:
    -flights$vars <- flights$year
    -
    -# The new column won't be an issue if you evaluate `vars` in the
    -# context with the `!!` operator:
    -vars <- c("year", "month", "day")
    -select(flights, !! vars)
    -#> # A tibble: 336,776 x 3
    -#>    year month   day
    -#>   <int> <int> <int>
    -#> 1  2013     1     1
    -#> 2  2013     1     1
    -#> 3  2013     1     1
    -#> 4  2013     1     1
    -#> # ... with 336,772 more rows
    +

    This operator is very useful when you need to use dplyr within custom functions. You can learn more about it in vignette("programming"). However, it is important to understand the semantics of the verbs you are unquoting into, that is, the values they understand. As we have just seen, select() supports names and positions of columns. But that won’t be the case in other verbs like mutate() because they have different semantics.
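As a minimal sketch of that use case, the helper below unquotes a character vector into select(). The name `pick_cols()` and the toy tibble are illustrative assumptions, not part of dplyr, and the sketch assumes a dplyr version in which select() accepts column names as strings. The tibble deliberately contains a column called `cols` to show that `!!` bypasses it.

```r
library(dplyr)

# Hypothetical helper, not part of dplyr: selects the columns named in `cols`.
pick_cols <- function(df, cols) {
  # `!!` evaluates `cols` in the function's environment, so a column that
  # happens to be called `cols` in `df` is ignored.
  select(df, !!cols)
}

# Toy data with a misleading `cols` column.
toy <- tibble(year = 2013L, month = 1L, day = 1L, cols = "x")
picked <- pick_cols(toy, c("year", "day"))
names(picked)
```

Because the vector is unquoted, the helper keeps working if the data frame later gains a column with the same name as the argument.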

    Mutating operations

    Mutate semantics are quite different from selection semantics. Whereas select() expects column names or positions, mutate() expects column vectors. Let’s create a smaller tibble for clarity:

    -
    df <- select(flights, year:dep_time)
    +

    When we use select(), the bare column names stand for their own positions in the tibble. For mutate(), on the other hand, column symbols represent the actual column vectors stored in the tibble. Consider what happens if we give a string or a number to mutate():

    -
    mutate(df, "year", 2)
    -#> # A tibble: 336,776 x 6
    -#>    year month   day dep_time `"year"`   `2`
    -#>   <int> <int> <int>    <int> <chr>    <dbl>
    -#> 1  2013     1     1      517 year         2
    -#> 2  2013     1     1      533 year         2
    -#> 3  2013     1     1      542 year         2
    -#> 4  2013     1     1      544 year         2
    -#> # ... with 336,772 more rows
    +

    mutate() gets length-1 vectors that it interprets as new columns in the data frame. These vectors are recycled so they match the number of rows. That’s why it doesn’t make sense to supply expressions like "year" + 10 to mutate(). This amounts to adding 10 to a string! The correct expression is:

    -
    mutate(df, year + 10)
    -#> # A tibble: 336,776 x 5
    -#>    year month   day dep_time `year + 10`
    -#>   <int> <int> <int>    <int>       <dbl>
    -#> 1  2013     1     1      517        2023
    -#> 2  2013     1     1      533        2023
    -#> 3  2013     1     1      542        2023
    -#> 4  2013     1     1      544        2023
    -#> # ... with 336,772 more rows
    +

    In the same way, you can unquote values from the context if these values represent a valid column. They must be either length 1 (they then get recycled) or have the same length as the number of rows. In the following example we create a new vector that we add to the data frame:

    -
    var <- seq(1, nrow(df))
    -mutate(df, new = var)
    -#> # A tibble: 336,776 x 5
    -#>    year month   day dep_time   new
    -#>   <int> <int> <int>    <int> <int>
    -#> 1  2013     1     1      517     1
    -#> 2  2013     1     1      533     2
    -#> 3  2013     1     1      542     3
    -#> 4  2013     1     1      544     4
    -#> # ... with 336,772 more rows
    +
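The difference between a bare name and an unquoted value becomes visible when a column and a context variable share a name. The sketch below uses a made-up tibble `toy` and is only meant to illustrate the lookup and recycling rules described above.

```r
library(dplyr)

# Toy tibble with a column that shares its name with a context variable.
toy <- tibble(var = c(10, 20, 30))
var <- 1:3

out <- mutate(
  toy,
  from_column  = var,    # bare name: the column `var` wins
  from_context = !!var,  # unquoted: the context vector 1:3 is inlined
  constant     = 0       # length-1 value, recycled to three rows
)
out
```

Without `!!`, the column masks the context variable; with it, the vector from the context is used as the new column.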

    A case in point is group_by(). While you might think it has select semantics, it actually has mutate semantics. This is quite handy as it allows you to group by a modified column:

    -
    group_by(df, month)
    -#> # A tibble: 336,776 x 4
    -#> # Groups:   month [12]
    -#>    year month   day dep_time
    -#>   <int> <int> <int>    <int>
    -#> 1  2013     1     1      517
    -#> 2  2013     1     1      533
    -#> 3  2013     1     1      542
    -#> 4  2013     1     1      544
    -#> # ... with 336,772 more rows
    -group_by(df, month = as.factor(month))
    -#> # A tibble: 336,776 x 4
    -#> # Groups:   month [12]
    -#>    year month   day dep_time
    -#>   <int> <fct> <int>    <int>
    -#> 1  2013 1         1      517
    -#> 2  2013 1         1      533
    -#> 3  2013 1         1      542
    -#> 4  2013 1         1      544
    -#> # ... with 336,772 more rows
    -group_by(df, day_binned = cut(day, 3))
    -#> # A tibble: 336,776 x 5
    -#> # Groups:   day_binned [3]
    -#>    year month   day dep_time day_binned
    -#>   <int> <int> <int>    <int> <fct>     
    -#> 1  2013     1     1      517 (0.97,11] 
    -#> 2  2013     1     1      533 (0.97,11] 
    -#> 3  2013     1     1      542 (0.97,11] 
    -#> 4  2013     1     1      544 (0.97,11] 
    -#> # ... with 336,772 more rows
    +

    This is why you can’t supply a column name to group_by(). This amounts to creating a new column containing the string recycled to the number of rows:

    -
    group_by(df, "month")
    -#> # A tibble: 336,776 x 5
    -#> # Groups:   "month" [1]
    -#>    year month   day dep_time `"month"`
    -#>   <int> <int> <int>    <int> <chr>    
    -#> 1  2013     1     1      517 month    
    -#> 2  2013     1     1      533 month    
    -#> 3  2013     1     1      542 month    
    -#> 4  2013     1     1      544 month    
    -#> # ... with 336,772 more rows
    +

    Since grouping with select semantics can sometimes be useful as well, we have added the group_by_at() variant. In dplyr, variants suffixed with _at() support selection semantics in their second argument. You just need to wrap the selection with vars():

    -
    group_by_at(df, vars(year:day))
    -#> # A tibble: 336,776 x 4
    -#> # Groups:   year, month, day [365]
    -#>    year month   day dep_time
    -#>   <int> <int> <int>    <int>
    -#> 1  2013     1     1      517
    -#> 2  2013     1     1      533
    -#> 3  2013     1     1      542
    -#> 4  2013     1     1      544
    -#> # ... with 336,772 more rows
    +

    You can read more about the _at() and _if() variants in the ?scoped help page.
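When the grouping columns are only known as strings, group_by_at() can also take a character vector of names in place of a vars() selection. The following is a small sketch on a made-up tibble; `toy` and `group_cols` are illustrative names.

```r
library(dplyr)

toy <- tibble(g1 = c(1, 1, 2), g2 = c("a", "b", "b"), x = 1:3)

# Column names stored as strings, e.g. coming from user input.
group_cols <- c("g1", "g2")

by_sel <- group_by_at(toy, vars(g1:g2))  # selection wrapped in vars()
by_chr <- group_by_at(toy, group_cols)   # character vector of names

# Both calls should report the same grouping variables.
group_vars(by_sel)
group_vars(by_chr)
```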

    Piping

    The dplyr API is functional in the sense that function calls don’t have side-effects. You must always save their results. This doesn’t lead to particularly elegant code, especially if you want to do many operations at once. You either have to do it step-by-step:

    -
    a1 <- group_by(flights, year, month, day)
    -a2 <- select(a1, arr_delay, dep_delay)
    -a3 <- summarise(a2,
    -  arr = mean(arr_delay, na.rm = TRUE),
    -  dep = mean(dep_delay, na.rm = TRUE))
    -a4 <- filter(a3, arr > 30 | dep > 30)
    +

    Or if you don’t want to name the intermediate results, you need to wrap the function calls inside each other:

    -
    filter(
    -  summarise(
    -    select(
    -      group_by(flights, year, month, day),
    -      arr_delay, dep_delay
    -    ),
    -    arr = mean(arr_delay, na.rm = TRUE),
    -    dep = mean(dep_delay, na.rm = TRUE)
    -  ),
    -  arr > 30 | dep > 30
    -)
    -#> Adding missing grouping variables: `year`, `month`, `day`
    -#> # A tibble: 49 x 5
    -#> # Groups:   year, month [11]
    -#>    year month   day   arr   dep
    -#>   <int> <int> <int> <dbl> <dbl>
    -#> 1  2013     1    16  34.2  24.6
    -#> 2  2013     1    31  32.6  28.7
    -#> 3  2013     2    11  36.3  39.1
    -#> 4  2013     2    27  31.3  37.8
    -#> # ... with 45 more rows
    +

    This is difficult to read because the operations are ordered from the inside out. Thus, the arguments are a long way away from the function. To get around this problem, dplyr provides the %>% operator from magrittr. x %>% f(y) turns into f(x, y) so you can use it to rewrite multiple operations that you can read left-to-right, top-to-bottom:

    -
    flights %>%
    -  group_by(year, month, day) %>%
    -  select(arr_delay, dep_delay) %>%
    -  summarise(
    -    arr = mean(arr_delay, na.rm = TRUE),
    -    dep = mean(dep_delay, na.rm = TRUE)
    -  ) %>%
    -  filter(arr > 30 | dep > 30)
    +
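To check that the piped and nested forms really are the same computation, here is a small sketch on a made-up tibble (`toy` is an illustrative name) that runs both and compares the results.

```r
library(dplyr)

toy <- tibble(g = c("a", "a", "b"), x = c(1, 2, 10))

# Nested form: read from the inside out.
nested <- summarise(group_by(toy, g), total = sum(x))

# Piped form: `x %>% f(y)` is rewritten to `f(x, y)` at each step.
piped <- toy %>%
  group_by(g) %>%
  summarise(total = sum(x))

# The two results should hold the same totals per group.
all.equal(nested$total, piped$total)
```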

    Other data sources

    diff -Nru r-cran-dplyr-0.7.6/inst/doc/programming.html r-cran-dplyr-0.7.8/inst/doc/programming.html --- r-cran-dplyr-0.7.6/inst/doc/programming.html 2018-06-27 18:50:00.000000000 +0000 +++ r-cran-dplyr-0.7.8/inst/doc/programming.html 2018-11-09 21:23:37.000000000 +0000 @@ -18,46 +18,256 @@ - + @@ -78,28 +288,28 @@

    Unfortunately these benefits do not come for free. There are two main drawbacks:

    Fortunately, dplyr provides tools to overcome these challenges. They require a little more typing, but a small amount of upfront work is worth it because they help you save time in the long run.

    @@ -112,23 +322,23 @@

    Warm up

    You might not have realised it, but you’re already accomplished at solving this type of problem in another domain: strings. It’s obvious that this function doesn’t do what you want:

    -
    greet <- function(name) {
    -  "How do you do, name?"
    -}
    -greet("Hadley")
    -#> [1] "How do you do, name?"
    +

    That’s because " “quotes” its input: it doesn’t interpret what you’ve typed, it just stores it in a string. One way to make the function do what you want is to use paste() to build up the string piece by piece:

    -
    greet <- function(name) {
    -  paste0("How do you do, ", name, "?")
    -}
    -greet("Hadley")
    -#> [1] "How do you do, Hadley?"
    +

    Another approach is exemplified by the glue package: it allows you to “unquote” components of a string, replacing the string with the value of the R expression. This allows an elegant implementation of our function because {name} is replaced with the value of the name argument.

    -
    greet <- function(name) {
    -  glue::glue("How do you do, {name}?")
    -}
    -greet("Hadley")
    -#> How do you do, Hadley?
    +

    Programming recipes

    @@ -136,245 +346,245 @@

    Different data sets

    You already know how to write functions that work with the first argument of dplyr verbs: the data. That’s because dplyr doesn’t do anything special with that argument, so it’s referentially transparent. For example, if you saw repeated code like this:

    -
    mutate(df1, y = a + x)
    -mutate(df2, y = a + x)
    -mutate(df3, y = a + x)
    -mutate(df4, y = a + x)
    +

    You could already write a function to capture that duplication:

    -
    mutate_y <- function(df) {
    -  mutate(df, y = a + x)
    -}
    +

    Unfortunately, there’s a drawback to this simple approach: it can fail silently if one of the variables isn’t present in the data frame, but is present in the global environment.

    -
    df1 <- tibble(x = 1:3)
    -a <- 10
    -mutate_y(df1)
    -#> # A tibble: 3 x 2
    -#>       x     y
    -#>   <int> <dbl>
    -#> 1     1    11
    -#> 2     2    12
    -#> 3     3    13
    +

    We can fix that ambiguity by being more explicit and using the .data pronoun. This will throw an informative error if the variable doesn’t exist:

    -
    mutate_y <- function(df) {
    -  mutate(df, y = .data$a + .data$x)
    -}
    -
    -mutate_y(df1)
    -#> Error in mutate_impl(.data, dots): Evaluation error: Column `a` not found in `.data`.
    +

    If this function is in a package, using .data also prevents R CMD check from giving a NOTE about undefined global variables (provided that you’ve also imported rlang::.data with @importFrom rlang .data).
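In a package, this could look roughly like the sketch below. The roxygen block is illustrative, and the tryCatch() call is only there to show that a missing column now raises an error instead of silently picking up a global variable.

```r
library(dplyr)

#' Hypothetical package function (the roxygen block is illustrative).
#' @importFrom rlang .data
mutate_y <- function(df) {
  mutate(df, y = .data$a + .data$x)
}

a <- 10  # a global `a` is no longer picked up silently

# Works when both columns exist in the data frame.
ok <- mutate_y(tibble(a = 1:3, x = 4:6))

# Fails when `a` is missing from the data, despite the global `a`.
err <- tryCatch(mutate_y(tibble(x = 1:3)), error = function(e) e)
inherits(err, "error")
```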

    Different expressions

    Writing a function is hard if you want one of the arguments to be a variable name (like x) or an expression (like x + y). That’s because dplyr automatically “quotes” those inputs, so they are not referentially transparent. Let’s start with a simple case: you want to vary the grouping variable for a data summarization.

    -
    df <- tibble(
    -  g1 = c(1, 1, 2, 2, 2),
    -  g2 = c(1, 2, 1, 2, 1),
    -  a = sample(5),
    -  b = sample(5)
    -)
    -
    -df %>%
    -  group_by(g1) %>%
    -  summarise(a = mean(a))
    -#> # A tibble: 2 x 2
    -#>      g1     a
    -#>   <dbl> <dbl>
    -#> 1     1  2.5 
    -#> 2     2  3.33
    -
    -df %>%
    -  group_by(g2) %>%
    -  summarise(a = mean(a))
    -#> # A tibble: 2 x 2
    -#>      g2     a
    -#>   <dbl> <dbl>
    -#> 1     1   2  
    -#> 2     2   4.5
    +

    You might hope that this will work:

    -
    my_summarise <- function(df, group_var) {
    -  df %>%
    -    group_by(group_var) %>%
    -    summarise(a = mean(a))
    -}
    -
    -my_summarise(df, g1)
    -#> Error in grouped_df_impl(data, unname(vars), drop): Column `group_var` is unknown
    +

    But it doesn’t.

    Maybe providing the variable name as a string will fix things?

    -
    my_summarise(df, "g2")
    -#> Error in grouped_df_impl(data, unname(vars), drop): Column `group_var` is unknown
    +

    Nope.

    If you look carefully at the error message, you’ll see that it’s the same in both cases. group_by() works like ": it doesn’t evaluate its input; it quotes it.

    To make this function work, we need to do two things. We need to quote the input ourselves (so my_summarise() can take a bare variable name like group_by()), and then we need to tell group_by() not to quote its input (because we’ve done the quoting).

    How do we quote the input? We can’t use "" to quote the input, because that gives us a string. Instead we need a function that captures the expression and its environment (we’ll come back to why this is important later on). There are two possible options we could use in base R, the function quote() and the operator ~. Neither of these work quite the way we want, so we need a new function: quo().

    quo() works like ": it quotes its input rather than evaluating it.

    -
    quo(g1)
    -#> <quosure>
    -#>   expr: ^g1
    -#>   env:  global
    -quo(a + b + c)
    -#> <quosure>
    -#>   expr: ^a + b + c
    -#>   env:  global
    -quo("a")
    -#> <quosure>
    -#>   expr: ^"a"
    -#>   env:  empty
    +

    quo() returns a quosure, which is a special type of formula. You’ll learn more about quosures later on.

    Now that we’ve captured this expression, how do we use it with group_by()? It doesn’t work if we just shove it into our naive approach:

    -
    my_summarise(df, quo(g1))
    -#> Error in grouped_df_impl(data, unname(vars), drop): Column `group_var` is unknown
    +

    We get the same error as before, because we haven’t yet told group_by() that we’re taking care of the quoting. In other words, we need to tell group_by() not to quote its input, because it has been pre-quoted by my_summarise(). Yet another way of saying the same thing is that we want to unquote group_var.

    In dplyr (and in tidyeval in general) you use !! to say that you want to unquote an input so that it’s evaluated, not quoted. This gives us a function that actually does what we want.

    -
    my_summarise <- function(df, group_var) {
    -  df %>%
    -    group_by(!! group_var) %>%
    -    summarise(a = mean(a))
    -}
    -
    -my_summarise(df, quo(g1))
    -#> # A tibble: 2 x 2
    -#>      g1     a
    -#>   <dbl> <dbl>
    -#> 1     1  2.5 
    -#> 2     2  3.33
    +

    Huzzah!

    There’s just one step left: we want to call this function like we call group_by():

    -
    my_summarise(df, g1)
    +

    This doesn’t work because there’s no object called g1. We need to capture what the user of the function typed and quote it for them. You might try using quo() to do that:

    -
    my_summarise <- function(df, group_var) {
    -  quo_group_var <- quo(group_var)
    -  print(quo_group_var)
    -
    -  df %>%
    -    group_by(!! quo_group_var) %>%
    -    summarise(a = mean(a))
    -}
    -
    -my_summarise(df, g1)
    -#> <quosure>
    -#>   expr: ^group_var
    -#>   env:  0x5642173038a0
    -#> Error in grouped_df_impl(data, unname(vars), drop): Column `group_var` is unknown
    +

    I’ve added a print() call to make it obvious what’s going wrong here: quo(group_var) always returns ~group_var. It is being too literal! We want it to substitute the value that the user supplied, i.e. to return ~g1.

    By analogy to strings, we don’t want "", instead we want some function that turns an argument into a string. That’s the job of enquo(). enquo() uses some dark magic to look at the argument, see what the user typed, and return that value as a quosure. (Technically, this works because function arguments are evaluated lazily, using a special data structure called a promise.)

    -
    my_summarise <- function(df, group_var) {
    -  group_var <- enquo(group_var)
    -  print(group_var)
    -
    -  df %>%
    -    group_by(!! group_var) %>%
    -    summarise(a = mean(a))
    -}
    -
    -my_summarise(df, g1)
    -#> <quosure>
    -#>   expr: ^g1
    -#>   env:  global
    -#> # A tibble: 2 x 2
    -#>      g1     a
    -#>   <dbl> <dbl>
    -#> 1     1  2.5 
    -#> 2     2  3.33
    +

    (If you’re familiar with quote() and substitute() in base R, quo() is equivalent to quote() and enquo() is equivalent to substitute().)
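The analogy can be checked directly. In the sketch below, the four wrapper functions are made-up names used only for illustration; get_expr() from rlang extracts the expression stored in a quosure so it can be compared with the base R results.

```r
library(rlang)

# Base R: quote() captures what is written, substitute() what the caller supplied.
f_quote <- function(x) quote(x)
f_subst <- function(x) substitute(x)

# rlang: quo() and enquo() behave analogously but also record an environment.
f_quo   <- function(x) quo(x)
f_enquo <- function(x) enquo(x)

f_quote(a + b)            # the symbol `x`
f_subst(a + b)            # the call `a + b`
get_expr(f_quo(a + b))    # the symbol `x`
get_expr(f_enquo(a + b))  # the call `a + b`
```

The main practical difference is that the rlang versions return quosures, which keep track of the environment in which the expression should be evaluated.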

    You might wonder how to extend this to handle multiple grouping variables: we’ll come back to that a little later.

    Different input variable

    Now let’s tackle something a bit more complicated. The code below shows a duplicate summarise() statement where we compute three summaries, varying the input variable.

    -
    summarise(df, mean = mean(a), sum = sum(a), n = n())
    -#> # A tibble: 1 x 3
    -#>    mean   sum     n
    -#>   <dbl> <int> <int>
    -#> 1     3    15     5
    -summarise(df, mean = mean(a * b), sum = sum(a * b), n = n())
    -#> # A tibble: 1 x 3
    -#>    mean   sum     n
    -#>   <dbl> <int> <int>
    -#> 1   9.6    48     5
    +

    To turn this into a function, we start by testing the basic approach interactively: we quote the variable with quo(), then unquote it in the dplyr call with !!. Notice that we can unquote anywhere inside a complicated expression.

    -
    my_var <- quo(a)
    -summarise(df, mean = mean(!! my_var), sum = sum(!! my_var), n = n())
    -#> # A tibble: 1 x 3
    -#>    mean   sum     n
    -#>   <dbl> <int> <int>
    -#> 1     3    15     5
    +

    You can also wrap quo() around the dplyr call to see what will happen from dplyr’s perspective. This is a very useful tool for debugging.

    -
    quo(summarise(df,
    -  mean = mean(!! my_var),
    -  sum = sum(!! my_var),
    -  n = n()
    -))
    -#> <quosure>
    -#>   expr: ^summarise(df, mean = mean(^a), sum = sum(^a), n = n())
    -#>   env:  global
    +

    Now we can turn our code into a function (remembering to replace quo() with enquo()), and check that it works:

    -
    my_summarise2 <- function(df, expr) {
    -  expr <- enquo(expr)
    -
    -  summarise(df,
    -    mean = mean(!! expr),
    -    sum = sum(!! expr),
    -    n = n()
    -  )
    -}
    -my_summarise2(df, a)
    -#> # A tibble: 1 x 3
    -#>    mean   sum     n
    -#>   <dbl> <int> <int>
    -#> 1     3    15     5
    -my_summarise2(df, a * b)
    -#> # A tibble: 1 x 3
    -#>    mean   sum     n
    -#>   <dbl> <int> <int>
    -#> 1   9.6    48     5
    +

    Different input and output variable

    The next challenge is to vary the name of the output variables:

    -
    mutate(df, mean_a = mean(a), sum_a = sum(a))
    -#> # A tibble: 5 x 6
    -#>      g1    g2     a     b mean_a sum_a
    -#>   <dbl> <dbl> <int> <int>  <dbl> <int>
    -#> 1     1     1     1     3      3    15
    -#> 2     1     2     4     2      3    15
    -#> 3     2     1     2     1      3    15
    -#> 4     2     2     5     4      3    15
    -#> # ... with 1 more row
    -mutate(df, mean_b = mean(b), sum_b = sum(b))
    -#> # A tibble: 5 x 6
    -#>      g1    g2     a     b mean_b sum_b
    -#>   <dbl> <dbl> <int> <int>  <dbl> <int>
    -#> 1     1     1     1     3      3    15
    -#> 2     1     2     4     2      3    15
    -#> 3     2     1     2     1      3    15
    -#> 4     2     2     5     4      3    15
    -#> # ... with 1 more row
    +

    This code is similar to the previous example, but there are two new wrinkles:

    • We create the new names by pasting together strings, so we need quo_name() to convert the input expression to a string.

    • -
    • !! mean_name = mean(!! expr) isn’t valid R code, so we need to use the := helper provided by rlang.

    • +
    • !!mean_name = mean(!!expr) isn’t valid R code, so we need to use the := helper provided by rlang.

    -
    my_mutate <- function(df, expr) {
    -  expr <- enquo(expr)
    -  mean_name <- paste0("mean_", quo_name(expr))
    -  sum_name <- paste0("sum_", quo_name(expr))
    -
    -  mutate(df,
    -    !! mean_name := mean(!! expr),
    -    !! sum_name := sum(!! expr)
    -  )
    -}
    -
    -my_mutate(df, a)
    -#> # A tibble: 5 x 6
    -#>      g1    g2     a     b mean_a sum_a
    -#>   <dbl> <dbl> <int> <int>  <dbl> <int>
    -#> 1     1     1     1     3      3    15
    -#> 2     1     2     4     2      3    15
    -#> 3     2     1     2     1      3    15
    -#> 4     2     2     5     4      3    15
    -#> # ... with 1 more row
    +

    Capturing multiple variables

    @@ -384,101 +594,101 @@
  • Use quos() to capture all the ... as a list of formulas.

  • Use !!! instead of !! to splice the arguments into group_by().

  • -
    my_summarise <- function(df, ...) {
    -  group_var <- quos(...)
    -
    -  df %>%
    -    group_by(!!! group_var) %>%
    -    summarise(a = mean(a))
    -}
    -
    -my_summarise(df, g1, g2)
    -#> # A tibble: 4 x 3
    -#> # Groups:   g1 [?]
    -#>      g1    g2     a
    -#>   <dbl> <dbl> <dbl>
    -#> 1     1     1   1  
    -#> 2     1     2   4  
    -#> 3     2     1   2.5
    -#> 4     2     2   5
    +

    !!! takes a list of elements and splices them into the current call. Look at the bottom of the !!! and think ....

    -
    args <- list(na.rm = TRUE, trim = 0.25)
    -quo(mean(x, !!! args))
    -#> <quosure>
    -#>   expr: ^mean(x, na.rm = TRUE, trim = 0.25)
    -#>   env:  global
    -
    -args <- list(quo(x), na.rm = TRUE, trim = 0.25)
    -quo(mean(!!! args))
    -#> <quosure>
    -#>   expr: ^mean(^x, na.rm = TRUE, trim = 0.25)
    -#>   env:  global
    +
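Splicing is not limited to quosures captured from `...`. As a hedged sketch, the helper below builds a list of symbols from a character vector with rlang::syms() and splices it into group_by(). The name `summarise_by()` and the toy data are assumptions made for illustration.

```r
library(dplyr)

toy <- tibble(g1 = c(1, 1, 2, 2), g2 = c(1, 2, 1, 1), a = c(1, 2, 3, 5))

# Hypothetical helper: group by columns given as a character vector.
summarise_by <- function(df, group_names) {
  group_syms <- rlang::syms(group_names)  # list of symbols, one per name
  df %>%
    group_by(!!!group_syms) %>%           # spliced as group_by(g1, g2)
    summarise(a = mean(a))
}

res <- summarise_by(toy, c("g1", "g2"))
res
```

This gives a string-based interface on top of the same splicing mechanism, which can be convenient when the column names come from configuration or user input.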

    Now that you’ve learned the basics of tidyeval through some practical examples, we’ll dive into the theory. This will help you generalise what you’ve learned here to new situations.

    Quoting

    Quoting is the action of capturing an expression instead of evaluating it. All expression-based functions quote their arguments and get the R code as an expression rather than the result of evaluating that code. If you are an R user, you probably quote expressions on a regular basis. One of the most important quoting operators in R is the formula. It is famously used for the specification of statistical models:

    -
    disp ~ cyl + drat
    -#> disp ~ cyl + drat
    +

    The other quoting operator in base R is quote(). It returns a raw expression rather than a formula:

    -
    # Computing the value of the expression:
    -toupper(letters[1:5])
    -#> [1] "A" "B" "C" "D" "E"
    -
    -# Capturing the expression:
    -quote(toupper(letters[1:5]))
    -#> toupper(letters[1:5])
    +

    (Note that despite being called the double quote, " is not a quoting operator in this context, because it generates a string, not an expression.)

    In practice, the formula is the better of the two options because it captures the code and its execution environment. This is important because even a simple expression can yield different values in different environments. For example, the x in the following two expressions refers to different values:

    -
    f <- function(x) {
    -  quo(x)
    -}
    -
    -x1 <- f(10)
    -x2 <- f(100)
    +

    It might look like the expressions are the same if you print them out.

    -
    x1
    -#> <quosure>
    -#>   expr: ^x
    -#>   env:  0x564211d71e60
    -x2
    -#> <quosure>
    -#>   expr: ^x
    -#>   env:  0x56421194ac90
    +

    But if you inspect the environments using rlang::get_env() — they’re different.

    -
    library(rlang)
    -
    -get_env(x1)
    -#> <environment: 0x564211d71e60>
    -get_env(x2)
    -#> <environment: 0x56421194ac90>
    +

    Further, when we evaluate those formulas using rlang::eval_tidy(), we see that they yield different values:

    -
    eval_tidy(x1)
    -#> [1] 10
    -eval_tidy(x2)
    -#> [1] 100
    +

    This is a key property of R: one name can refer to different values in different environments. This is also important for dplyr, because it allows you to combine variables and objects in a call:

    -
    user_var <- 1000
    -mtcars %>% summarise(cyl = mean(cyl) * user_var)
    -#>      cyl
    -#> 1 6187.5
    +

    When an object keeps track of an environment, it is said to have an enclosure. This is the reason that functions in R are sometimes referred to as closures:

    -
    typeof(mean)
    -#> [1] "closure"
    +

    For this reason we use a special name to refer to one-sided formulas: quosures. One-sided formulas are quotes (they carry an expression) with an environment.

    Quosures are regular R objects. They can be stored in a variable and inspected:

    -
    var <- ~toupper(letters[1:5])
    -var
    -#> ~toupper(letters[1:5])
    -
    -# You can extract its expression:
    -get_expr(var)
    -#> toupper(letters[1:5])
    -
    -# Or inspect its enclosure:
    -get_env(var)
    -#> <environment: R_GlobalEnv>
    +

    Quasiquotation

    @@ -494,125 +704,91 @@

    Unquoting

    -

    The first important operation is the basic unquote, which comes in a functional form, UQ(), and as syntactic-sugar, !!.

    -
    # Here we capture `letters[1:5]` as an expression:
    -quo(toupper(letters[1:5]))
    -#> <quosure>
    -#>   expr: ^toupper(letters[1:5])
    -#>   env:  global
    -
    -# Here we capture the value of `letters[1:5]`
    -quo(toupper(!! letters[1:5]))
    -#> <quosure>
    -#>   expr: ^toupper(<chr: "a", "b", "c", "d", "e">)
    -#>   env:  global
    -quo(toupper(UQ(letters[1:5])))
    -#> <quosure>
    -#>   expr: ^toupper(<chr: "a", "b", "c", "d", "e">)
    -#>   env:  global
    +

    The first important operation is the basic unquote, !!.

    +

    It is also possible to unquote other quoted expressions. Unquoting such symbolic objects provides a powerful way of manipulating expressions.

    -
    var1 <- quo(letters[1:5])
    -quo(toupper(!! var1))
    -#> <quosure>
    -#>   expr: ^toupper(^letters[1:5])
    -#>   env:  global
    +

    You can safely unquote quosures because they track their environments, and tidyeval functions know how to evaluate them. This allows any depth of quoting and unquoting.

    -
    my_mutate <- function(x) {
    -  mtcars %>%
    -    select(cyl) %>%
    -    slice(1:4) %>%
    -    mutate(cyl2 = cyl + (!! x))
    -}
    -
    -f <- function(x) quo(x)
    -expr1 <- f(100)
    -expr2 <- f(10)
    -
    -my_mutate(expr1)
    -#>   cyl cyl2
    -#> 1   6  106
    -#> 2   6  106
    -#> 3   4  104
    -#> 4   6  106
    -my_mutate(expr2)
    -#>   cyl cyl2
    -#> 1   6   16
    -#> 2   6   16
    -#> 3   4   14
    -#> 4   6   16
    -

    The functional form is useful in cases where the precedence of ! causes problems:

    -
    my_fun <- quo(fun)
    -quo(!! my_fun(x, y, z))
    -#> Error in my_fun(x, y, z): could not find function "my_fun"
    -quo(UQ(my_fun)(x, y, z))
    -#> <quosure>
    -#>   expr: ^^fun(x, y, z)
    -#>   env:  global
    -
    -my_var <- quo(x)
    -quo(filter(df, !! my_var == 1))
    -#> <quosure>
    -#>   expr: ^filter(df, (^x) == 1)
    -#>   env:  global
    -quo(filter(df, UQ(my_var) == 1))
    -#> <quosure>
    -#>   expr: ^filter(df, (^x) == 1)
    -#>   env:  global
    -

    You’ll note above that UQ() yields a quosure containing a formula. That ensures that when the quosure is evaluated, it’ll be looked up in the right environment. In certain code-generation scenarios you just want to use the expression and ignore the environment. That’s the job of UQE():

    -
    quo(UQE(my_fun)(x, y, z))
    -#> Warning: `UQE()` is deprecated. Please use `!! get_expr(x)`
    -#> <quosure>
    -#>   expr: ^fun(x, y, z)
    -#>   env:  global
    -quo(filter(df, UQE(my_var) == 1))
    -#> Warning: `UQE()` is deprecated. Please use `!! get_expr(x)`
    -#> <quosure>
    -#>   expr: ^filter(df, x == 1)
    -#>   env:  global
    -

    UQE() is for expert use only as you’ll have to carefully analyse the environments to ensure that the generated code is correct.

    +

    Unquote-splicing

    -

    The second unquote operation is unquote-splicing. Its functional form is UQS() and the syntactic shortcut is !!!. It takes a vector and inserts each element of the vector in the surrounding function call:

    -
    quo(list(!!! letters[1:5]))
    -#> <quosure>
    -#>   expr: ^list("a", "b", "c", "d", "e")
    -#>   env:  global
    +

    The second unquote operation is unquote-splicing, !!!. It takes a vector and inserts each element of the vector in the surrounding function call:

    +

    A very useful feature of unquote-splicing is that the vector names become argument names:

    -
    x <- list(foo = 1L, bar = quo(baz))
    -quo(list(!!! x))
    -#> <quosure>
    -#>   expr: ^list(foo = 1L, bar = ^baz)
    -#>   env:  global
    +

    This makes it easy to program with dplyr verbs that take named dots:

    -
    args <- list(mean = quo(mean(cyl)), count = quo(n()))
    -mtcars %>%
    -  group_by(am) %>%
    -  summarise(!!! args)
    -#> # A tibble: 2 x 3
    -#>      am  mean count
    -#>   <dbl> <dbl> <int>
    -#> 1     0  6.95    19
    -#> 2     1  5.08    13
    +

    Setting variable names

    The final unquote operation is setting argument names. You’ve seen one way to do that above, but you can also use the definition operator := instead of =. := supports unquoting on both the LHS and the RHS.

    The rules on the LHS are slightly different: the unquoted operand should evaluate to a string or a symbol.

    -
    mean_nm <- "mean"
    -count_nm <- "count"
    -
    -mtcars %>%
    -  group_by(am) %>%
    -  summarise(
    -    !! mean_nm := mean(cyl),
    -    !! count_nm := n()
    -  )
    -#> # A tibble: 2 x 3
    -#>      am  mean count
    -#>   <dbl> <dbl> <int>
    -#> 1     0  6.95    19
    -#> 2     1  5.08    13
    +
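The same pattern is handy inside a function when the output name is computed at run time. In the sketch below, `mean_cyl_as()` is a made-up helper that builds the column name from a prefix string and uses it on the LHS of `:=`.

```r
library(dplyr)

# Hypothetical helper: the output column name is built from a prefix string.
mean_cyl_as <- function(df, prefix) {
  out_name <- paste0(prefix, "_cyl")      # a string is a valid LHS operand
  summarise(df, !!out_name := mean(cyl))
}

res <- mean_cyl_as(mtcars, "avg")
names(res)
```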
    diff -Nru r-cran-dplyr-0.7.6/inst/doc/programming.R r-cran-dplyr-0.7.8/inst/doc/programming.R --- r-cran-dplyr-0.7.6/inst/doc/programming.R 2018-06-27 18:50:00.000000000 +0000 +++ r-cran-dplyr-0.7.8/inst/doc/programming.R 2018-11-09 21:23:37.000000000 +0000 @@ -102,7 +102,7 @@ ## ------------------------------------------------------------------------ my_summarise <- function(df, group_var) { df %>% - group_by(!! group_var) %>% + group_by(!!group_var) %>% summarise(a = mean(a)) } @@ -117,7 +117,7 @@ print(quo_group_var) df %>% - group_by(!! quo_group_var) %>% + group_by(!!quo_group_var) %>% summarise(a = mean(a)) } @@ -129,7 +129,7 @@ print(group_var) df %>% - group_by(!! group_var) %>% + group_by(!!group_var) %>% summarise(a = mean(a)) } @@ -141,12 +141,12 @@ ## ------------------------------------------------------------------------ my_var <- quo(a) -summarise(df, mean = mean(!! my_var), sum = sum(!! my_var), n = n()) +summarise(df, mean = mean(!!my_var), sum = sum(!!my_var), n = n()) ## ------------------------------------------------------------------------ quo(summarise(df, - mean = mean(!! my_var), - sum = sum(!! my_var), + mean = mean(!!my_var), + sum = sum(!!my_var), n = n() )) @@ -155,8 +155,8 @@ expr <- enquo(expr) summarise(df, - mean = mean(!! expr), - sum = sum(!! expr), + mean = mean(!!expr), + sum = sum(!!expr), n = n() ) } @@ -174,8 +174,8 @@ sum_name <- paste0("sum_", quo_name(expr)) mutate(df, - !! mean_name := mean(!! expr), - !! sum_name := sum(!! expr) + !!mean_name := mean(!!expr), + !!sum_name := sum(!!expr) ) } @@ -186,7 +186,7 @@ group_var <- quos(...) df %>% - group_by(!!! group_var) %>% + group_by(!!!group_var) %>% summarise(a = mean(a)) } @@ -194,10 +194,10 @@ ## ------------------------------------------------------------------------ args <- list(na.rm = TRUE, trim = 0.25) -quo(mean(x, !!! args)) +quo(mean(x, !!!args)) args <- list(quo(x), na.rm = TRUE, trim = 0.25) -quo(mean(!!! 
args)) +quo(mean(!!!args)) ## ------------------------------------------------------------------------ disp ~ cyl + drat @@ -253,19 +253,18 @@ quo(toupper(letters[1:5])) # Here we capture the value of `letters[1:5]` -quo(toupper(!! letters[1:5])) -quo(toupper(UQ(letters[1:5]))) +quo(toupper(!!letters[1:5])) ## ------------------------------------------------------------------------ var1 <- quo(letters[1:5]) -quo(toupper(!! var1)) +quo(toupper(!!var1)) ## ------------------------------------------------------------------------ my_mutate <- function(x) { mtcars %>% select(cyl) %>% slice(1:4) %>% - mutate(cyl2 = cyl + (!! x)) + mutate(cyl2 = cyl + (!!x)) } f <- function(x) quo(x) @@ -275,31 +274,18 @@ my_mutate(expr1) my_mutate(expr2) -## ---- error = TRUE------------------------------------------------------- -my_fun <- quo(fun) -quo(!! my_fun(x, y, z)) -quo(UQ(my_fun)(x, y, z)) - -my_var <- quo(x) -quo(filter(df, !! my_var == 1)) -quo(filter(df, UQ(my_var) == 1)) - -## ------------------------------------------------------------------------ -quo(UQE(my_fun)(x, y, z)) -quo(filter(df, UQE(my_var) == 1)) - ## ------------------------------------------------------------------------ -quo(list(!!! letters[1:5])) +quo(list(!!!letters[1:5])) ## ------------------------------------------------------------------------ x <- list(foo = 1L, bar = quo(baz)) -quo(list(!!! x)) +quo(list(!!!x)) ## ------------------------------------------------------------------------ args <- list(mean = quo(mean(cyl)), count = quo(n())) mtcars %>% group_by(am) %>% - summarise(!!! args) + summarise(!!!args) ## ------------------------------------------------------------------------ mean_nm <- "mean" @@ -308,7 +294,7 @@ mtcars %>% group_by(am) %>% summarise( - !! mean_nm := mean(cyl), - !! 
count_nm := n() + !!mean_nm := mean(cyl), + !!count_nm := n() ) diff -Nru r-cran-dplyr-0.7.6/inst/doc/programming.Rmd r-cran-dplyr-0.7.8/inst/doc/programming.Rmd --- r-cran-dplyr-0.7.6/inst/doc/programming.Rmd 2018-05-03 08:08:12.000000000 +0000 +++ r-cran-dplyr-0.7.8/inst/doc/programming.Rmd 2018-11-09 20:55:35.000000000 +0000 @@ -278,7 +278,7 @@ ```{r} my_summarise <- function(df, group_var) { df %>% - group_by(!! group_var) %>% + group_by(!!group_var) %>% summarise(a = mean(a)) } @@ -304,7 +304,7 @@ print(quo_group_var) df %>% - group_by(!! quo_group_var) %>% + group_by(!!quo_group_var) %>% summarise(a = mean(a)) } @@ -327,7 +327,7 @@ print(group_var) df %>% - group_by(!! group_var) %>% + group_by(!!group_var) %>% summarise(a = mean(a)) } @@ -359,7 +359,7 @@ ```{r} my_var <- quo(a) -summarise(df, mean = mean(!! my_var), sum = sum(!! my_var), n = n()) +summarise(df, mean = mean(!!my_var), sum = sum(!!my_var), n = n()) ``` You can also wrap `quo()` around the dplyr call to see what will happen from @@ -367,8 +367,8 @@ ```{r} quo(summarise(df, - mean = mean(!! my_var), - sum = sum(!! my_var), + mean = mean(!!my_var), + sum = sum(!!my_var), n = n() )) ``` @@ -381,8 +381,8 @@ expr <- enquo(expr) summarise(df, - mean = mean(!! expr), - sum = sum(!! expr), + mean = mean(!!expr), + sum = sum(!!expr), n = n() ) } @@ -405,7 +405,7 @@ * We create the new names by pasting together strings, so we need `quo_name()` to convert the input expression to a string. -* `!! mean_name = mean(!! expr)` isn't valid R code, so we need to +* `!!mean_name = mean(!!expr)` isn't valid R code, so we need to use the `:=` helper provided by rlang. ```{r} @@ -415,8 +415,8 @@ sum_name <- paste0("sum_", quo_name(expr)) mutate(df, - !! mean_name := mean(!! expr), - !! sum_name := sum(!! expr) + !!mean_name := mean(!!expr), + !!sum_name := sum(!!expr) ) } @@ -441,7 +441,7 @@ group_var <- quos(...) df %>% - group_by(!!! 
group_var) %>% + group_by(!!!group_var) %>% summarise(a = mean(a)) } @@ -453,10 +453,10 @@ ```{r} args <- list(na.rm = TRUE, trim = 0.25) -quo(mean(x, !!! args)) +quo(mean(x, !!!args)) args <- list(quo(x), na.rm = TRUE, trim = 0.25) -quo(mean(!!! args)) +quo(mean(!!!args)) ``` Now that you've learned the basics of tidyeval through some practical examples, @@ -589,16 +589,14 @@ ### Unquoting -The first important operation is the basic unquote, which comes in a functional -form, `UQ()`, and as syntactic-sugar, `!!`. +The first important operation is the basic unquote, `!!`. ```{r} # Here we capture `letters[1:5]` as an expression: quo(toupper(letters[1:5])) # Here we capture the value of `letters[1:5]` -quo(toupper(!! letters[1:5])) -quo(toupper(UQ(letters[1:5]))) +quo(toupper(!!letters[1:5])) ``` It is also possible to unquote other quoted expressions. Unquoting such @@ -606,7 +604,7 @@ ```{r} var1 <- quo(letters[1:5]) -quo(toupper(!! var1)) +quo(toupper(!!var1)) ``` You can safely unquote quosures because they track their environments, and @@ -618,7 +616,7 @@ mtcars %>% select(cyl) %>% slice(1:4) %>% - mutate(cyl2 = cyl + (!! x)) + mutate(cyl2 = cyl + (!!x)) } f <- function(x) quo(x) @@ -629,41 +627,14 @@ my_mutate(expr2) ``` -The functional form is useful in cases where the precedence of `!` causes -problems: - -```{r, error = TRUE} -my_fun <- quo(fun) -quo(!! my_fun(x, y, z)) -quo(UQ(my_fun)(x, y, z)) - -my_var <- quo(x) -quo(filter(df, !! my_var == 1)) -quo(filter(df, UQ(my_var) == 1)) -``` - -You'll note above that `UQ()` yields a quosure containing a formula. That -ensures that when the quosure is evaluated, it'll be looked up in the right -environment. In certain code-generation scenarios you just want to use -expression and ignore the environment. 
That's the job of `UQE()`: - -```{r} -quo(UQE(my_fun)(x, y, z)) -quo(filter(df, UQE(my_var) == 1)) -``` - -`UQE()` is for expert use only as you'll have to carefully analyse the -environments to ensure that the generated code is correct. - ### Unquote-splicing -The second unquote operation is unquote-splicing. Its functional form is `UQS()` -and the syntactic shortcut is `!!!`. It takes a vector and inserts each element +The second unquote operation is unquote-splicing, `!!!`. It takes a vector and inserts each element of the vector in the surrounding function call: ```{r} -quo(list(!!! letters[1:5])) +quo(list(!!!letters[1:5])) ``` A very useful feature of unquote-splicing is that the vector names @@ -671,7 +642,7 @@ ```{r} x <- list(foo = 1L, bar = quo(baz)) -quo(list(!!! x)) +quo(list(!!!x)) ``` This makes it easy to program with dplyr verbs that take named dots: @@ -680,7 +651,7 @@ args <- list(mean = quo(mean(cyl)), count = quo(n())) mtcars %>% group_by(am) %>% - summarise(!!! args) + summarise(!!!args) ``` @@ -700,7 +671,7 @@ mtcars %>% group_by(am) %>% summarise( - !! mean_nm := mean(cyl), - !! count_nm := n() + !!mean_nm := mean(cyl), + !!count_nm := n() ) ``` diff -Nru r-cran-dplyr-0.7.6/inst/doc/two-table.html r-cran-dplyr-0.7.8/inst/doc/two-table.html --- r-cran-dplyr-0.7.6/inst/doc/two-table.html 2018-06-27 18:50:03.000000000 +0000 +++ r-cran-dplyr-0.7.8/inst/doc/two-table.html 2018-11-09 21:23:39.000000000 +0000 @@ -18,46 +18,256 @@ - + @@ -81,98 +291,101 @@

    Mutating joins

    Mutating joins allow you to combine variables from multiple tables. For example, take the nycflights13 data. In one table we have flight information with an abbreviation for carrier, and in another we have a mapping between abbreviations and full names. You can use a join to add the carrier names to the flight data:

    -
    library("nycflights13")
    -# Drop unimportant variables so it's easier to understand the join results.
    -flights2 <- flights %>% select(year:day, hour, origin, dest, tailnum, carrier)
    -
    -flights2 %>% 
    -  left_join(airlines)
    -#> Joining, by = "carrier"
    -#> # A tibble: 336,776 x 9
    -#>    year month   day  hour origin dest  tailnum carrier name               
    -#>   <int> <int> <int> <dbl> <chr>  <chr> <chr>   <chr>   <chr>              
    -#> 1  2013     1     1     5 EWR    IAH   N14228  UA      United Air Lines I…
    -#> 2  2013     1     1     5 LGA    IAH   N24211  UA      United Air Lines I…
    -#> 3  2013     1     1     5 JFK    MIA   N619AA  AA      American Airlines …
    -#> 4  2013     1     1     5 JFK    BQN   N804JB  B6      JetBlue Airways    
    -#> 5  2013     1     1     6 LGA    ATL   N668DN  DL      Delta Air Lines In…
    -#> # ... with 3.368e+05 more rows
    +

    Controlling how the tables are matched

    As well as x and y, each mutating join takes an argument by that controls which variables are used to match observations in the two tables. There are a few ways to specify it, as I illustrate below with various tables from nycflights13:

    Types of join

    There are four types of mutating join, which differ in their behaviour when a match is not found. We’ll illustrate each with a simple example:

    -
    (df1 <- data_frame(x = c(1, 2), y = 2:1))
    -#> # A tibble: 2 x 2
    -#>       x     y
    -#>   <dbl> <int>
    -#> 1     1     2
    -#> 2     2     1
    -(df2 <- data_frame(x = c(1, 3), a = 10, b = "a"))
    -#> # A tibble: 2 x 3
    -#>       x     a b    
    -#>   <dbl> <dbl> <chr>
    -#> 1     1    10 a    
    -#> 2     3    10 a
    +

    The left, right and full joins are collectively known as outer joins. When a row doesn’t match in an outer join, the new variables are filled in with missing values.

    Observations

    While mutating joins are primarily used to add new variables, they can also generate new observations. If a match is not unique, a join will add all possible combinations (the Cartesian product) of the matching observations:

    -
    df1 <- data_frame(x = c(1, 1, 2), y = 1:3)
    -df2 <- data_frame(x = c(1, 1, 2), z = c("a", "b", "a"))
    -
    -df1 %>% left_join(df2)
    -#> Joining, by = "x"
    -#> # A tibble: 5 x 3
    -#>       x     y z    
    -#>   <dbl> <int> <chr>
    -#> 1     1     1 a    
    -#> 2     1     1 b    
    -#> 3     1     2 a    
    -#> 4     1     2 b    
    -#> 5     2     3 a
    +
    @@ -252,32 +465,32 @@
  • anti_join(x, y) drops all observations in x that have a match in y.
  • These are most useful for diagnosing join mismatches. For example, there are many flights in the nycflights13 dataset that don’t have a matching tail number in the planes table:

    -
    library("nycflights13")
    -flights %>% 
    -  anti_join(planes, by = "tailnum") %>% 
    -  count(tailnum, sort = TRUE)
    -#> # A tibble: 722 x 2
    -#>   tailnum     n
    -#>   <chr>   <int>
    -#> 1 <NA>     2512
    -#> 2 N725MQ    575
    -#> 3 N722MQ    513
    -#> 4 N723MQ    507
    -#> 5 N713MQ    483
    -#> # ... with 717 more rows
    +

    If you’re worried about what observations your joins will match, start with a semi_join() or anti_join(). semi_join() and anti_join() never duplicate; they only ever remove observations.

    -
    df1 <- data_frame(x = c(1, 1, 3, 4), y = 1:4)
    -df2 <- data_frame(x = c(1, 1, 2), z = c("a", "b", "a"))
    -
    -# Four rows to start with:
    -df1 %>% nrow()
    -#> [1] 4
    -# And we get four rows after the join
    -df1 %>% inner_join(df2, by = "x") %>% nrow()
    -#> [1] 4
    -# But only two rows actually match
    -df1 %>% semi_join(df2, by = "x") %>% nrow()
    -#> [1] 2
    +

    Set operations

    @@ -288,100 +501,100 @@
  • setdiff(x, y): return observations in x, but not in y.
  • Given this simple data:

    -
    (df1 <- data_frame(x = 1:2, y = c(1L, 1L)))
    -#> # A tibble: 2 x 2
    -#>       x     y
    -#>   <int> <int>
    -#> 1     1     1
    -#> 2     2     1
    -(df2 <- data_frame(x = 1:2, y = 1:2))
    -#> # A tibble: 2 x 2
    -#>       x     y
    -#>   <int> <int>
    -#> 1     1     1
    -#> 2     2     2
    +

    The four possibilities are:

    -
    intersect(df1, df2)
    -#> # A tibble: 1 x 2
    -#>       x     y
    -#>   <int> <int>
    -#> 1     1     1
    -# Note that we get 3 rows, not 4
    -union(df1, df2)
    -#> # A tibble: 3 x 2
    -#>       x     y
    -#>   <int> <int>
    -#> 1     2     2
    -#> 2     2     1
    -#> 3     1     1
    -setdiff(df1, df2)
    -#> # A tibble: 1 x 2
    -#>       x     y
    -#>   <int> <int>
    -#> 1     2     1
    -setdiff(df2, df1)
    -#> # A tibble: 1 x 2
    -#>       x     y
    -#>   <int> <int>
    -#> 1     2     2
    +

    Coercion rules

    When joining tables, dplyr is a little more conservative than base R about the types of variable that it considers equivalent. This is most likely to surprise you if you’re working with factors:

    Otherwise logicals will be silently upcast to integer, and integer to numeric, but coercing to character will raise an error:

    -
    df1 <- data_frame(x = 1, y = 1L)
    -df2 <- data_frame(x = 2, y = 1.5)
    -full_join(df1, df2) %>% str()
    -#> Joining, by = c("x", "y")
    -#> Classes 'tbl_df', 'tbl' and 'data.frame':    2 obs. of  2 variables:
    -#>  $ x: num  1 2
    -#>  $ y: num  1 1.5
    -
    -df1 <- data_frame(x = 1, y = 1L)
    -df2 <- data_frame(x = 2, y = "a")
    -full_join(df1, df2) %>% str()
    -#> Joining, by = c("x", "y")
    -#> Error in full_join_impl(x, y, by_x, by_y, aux_x, aux_y, na_matches): Can't join on 'y' x 'y' because of incompatible types (character / integer)
    +

    Multiple-table verbs

    diff -Nru r-cran-dplyr-0.7.6/inst/doc/window-functions.html r-cran-dplyr-0.7.8/inst/doc/window-functions.html --- r-cran-dplyr-0.7.6/inst/doc/window-functions.html 2018-06-27 18:50:04.000000000 +0000 +++ r-cran-dplyr-0.7.8/inst/doc/window-functions.html 2018-11-09 21:23:40.000000000 +0000 @@ -18,46 +18,256 @@ - + @@ -72,30 +282,30 @@

    A window function is a variation on an aggregation function. Where an aggregation function, like sum() and mean(), takes n inputs and returns a single value, a window function returns n values. The output of a window function depends on all its input values, so window functions don’t include functions that work element-wise, like + or round(). Window functions include variations on aggregate functions, like cumsum() and cummean(), functions for ranking and ordering, like rank(), and functions for taking offsets, like lead() and lag().

    In this vignette, we’ll use a small sample of the Lahman batting dataset, including the players that have won an award.

    -
    library(Lahman)
    -
    -batting <- Lahman::Batting %>%
    -  as_tibble() %>%
    -  select(playerID, yearID, teamID, G, AB:H) %>%
    -  arrange(playerID, yearID, teamID) %>%
    -  semi_join(Lahman::AwardsPlayers, by = "playerID")
    -
    -players <- batting %>% group_by(playerID)
    +

    Window functions are used in conjunction with mutate() and filter() to solve a wide range of problems. Here’s a selection:

    -
    # For each player, find the two years with most hits
    -filter(players, min_rank(desc(H)) <= 2 & H > 0)
    -# Within each player, rank each year by the number of games played
    -mutate(players, G_rank = min_rank(G))
    -
    -# For each player, find every year that was better than the previous year
    -filter(players, G > lag(G))
    -# For each player, compute avg change in games played per year
    -mutate(players, G_change = (G - lag(G)) / (yearID - lag(yearID)))
    -
    -# For each player, find all where they played more games than average
    -filter(players, G > mean(G))
    -# For each, player compute a z score based on number of games played
    -mutate(players, G_z = (G - mean(G)) / sd(G))
    +

    Before reading this vignette, you should be familiar with mutate() and filter().

    Types of window functions

    @@ -115,130 +325,130 @@

    Ranking functions

    The ranking functions are variations on a theme, differing in how they handle ties:

    -
    x <- c(1, 1, 2, 2, 2)
    -
    -row_number(x)
    -#> [1] 1 2 3 4 5
    -min_rank(x)
    -#> [1] 1 1 3 3 3
    -dense_rank(x)
    -#> [1] 1 1 2 2 2
    +

    If you’re familiar with R, you may recognise that row_number() and min_rank() can be computed with the base rank() function and various values of the ties.method argument. These functions are provided to save a little typing, and to make it easier to convert between R and SQL.
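As a minimal sketch of the correspondence described here, the calls below use only base R on the vector `x` from this section. The variable names are illustrative, and the sketch assumes input without missing values. Base `rank()` has no ties method that matches `dense_rank()`, so that case is approximated with `match()`.

```r
x <- c(1, 1, 2, 2, 2)

# Ties broken by position: the same ranks row_number(x) reports
r_first <- rank(x, ties.method = "first")

# Tied values share the smallest rank: the same ranks min_rank(x) reports
r_min <- rank(x, ties.method = "min")

# No gaps after ties, as with dense_rank(x): look each value up
# in the sorted vector of unique values
r_dense <- match(x, sort(unique(x)))

print(r_first)
print(r_min)
print(r_dense)
```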

    Two other ranking functions return numbers between 0 and 1. percent_rank() gives the min_rank() rescaled to the range [0, 1]; cume_dist() gives the proportion of values less than or equal to the current value.
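The two definitions can be written out in base R. This is a sketch under the assumption that the input has no missing values and more than one element. The names `pr` and `cd` are illustrative, and the formulas are one way to reproduce the quantities described here, not a statement about how dplyr computes them internally.

```r
x <- c(1, 1, 2, 2, 2)
n <- length(x)

# Percent rank: the minimum rank, shifted to start at 0 and divided by n - 1
pr <- (rank(x, ties.method = "min") - 1) / (n - 1)

# Cumulative distribution: share of values less than or equal to each value
cd <- vapply(x, function(v) mean(x <= v), numeric(1))

print(pr)
print(cd)
```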

    -
    cume_dist(x)
    -#> [1] 0.4 0.4 1.0 1.0 1.0
    -percent_rank(x)
    -#> [1] 0.0 0.0 0.5 0.5 0.5
    +

    These are useful if you want to select (for example) the top 10% of records within each group. For example:

    -
    filter(players, cume_dist(desc(G)) < 0.1)
    -#> # A tibble: 1,010 x 7
    -#> # Groups:   playerID [920]
    -#>   playerID  yearID teamID     G    AB     R     H
    -#>   <chr>      <int> <fct>  <int> <int> <int> <int>
    -#> 1 aaronha01   1963 ML1      161   631   121   201
    -#> 2 aaronha01   1968 ATL      160   606    84   174
    -#> 3 abbotji01   1991 CAL       34     0     0     0
    -#> 4 abernte02   1965 CHN       84    18     1     3
    -#> # ... with 1,006 more rows
    +

    Finally, ntile() divides the data up into n evenly sized buckets. It’s a coarse ranking, and it can be used with mutate() to divide the data into buckets for further summary. For example, we could use ntile() to divide the players within a team into four ranked groups, and calculate the average number of games within each group.

    -
    by_team_player <- group_by(batting, teamID, playerID)
    -by_team <- summarise(by_team_player, G = sum(G))
    -by_team_quartile <- group_by(by_team, quartile = ntile(G, 4))
    -summarise(by_team_quartile, mean(G))
    -#> # A tibble: 4 x 2
    -#>   quartile `mean(G)`
    -#>      <int>     <dbl>
    -#> 1        1      27.2
    -#> 2        2      97.4
    -#> 3        3     271. 
    -#> 4        4     975.
    +

    All ranking functions rank from lowest to highest so that small input values get small ranks. Use desc() to rank from highest to lowest.

    Lead and lag

    lead() and lag() produce offset versions of an input vector that are either ahead of or behind the original vector.

    -
    x <- 1:5
    -lead(x)
    -#> [1]  2  3  4  5 NA
    -lag(x)
    -#> [1] NA  1  2  3  4
    +

    You can use them to:

    lead() and lag() have an optional argument order_by. If set, instead of using the row order to determine which value comes before another, they will use another variable. This is important if you have not already sorted the data, or you want to sort one way and lag another.

    Here’s a simple example of what happens if you don’t specify order_by when you need it:

    -
    df <- data.frame(year = 2000:2005, value = (0:5) ^ 2)
    -scrambled <- df[sample(nrow(df)), ]
    -
    -wrong <- mutate(scrambled, running = cumsum(value))
    -arrange(wrong, year)
    -#>   year value running
    -#> 1 2000     0       0
    -#> 2 2001     1      55
    -#> 3 2002     4      20
    -#> 4 2003     9      54
    -#> 5 2004    16      16
    -#> 6 2005    25      45
    -
    -right <- mutate(scrambled, running = order_by(year, cumsum(value)))
    -arrange(right, year)
    -#>   year value running
    -#> 1 2000     0       0
    -#> 2 2001     1       1
    -#> 3 2002     4       5
    -#> 4 2003     9      14
    -#> 5 2004    16      30
    -#> 6 2005    25      55
    +

    Cumulative aggregates

    Base R provides cumulative sum (cumsum()), cumulative min (cummin()) and cumulative max (cummax()). (It also provides cumprod() but that is rarely useful). Other common accumulating functions are cumany() and cumall(), cumulative versions of || and &&, and cummean(), a cumulative mean. These are not included in base R, but efficient versions are provided by dplyr.
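To make "cumulative versions of || and &&" concrete, here is a base R sketch of the two ideas for logical vectors without missing values. It does not call dplyr, and the variable names are illustrative.

```r
reached <- c(FALSE, FALSE, TRUE, FALSE)
valid   <- c(TRUE, TRUE, FALSE, TRUE)

# Cumulative "or": TRUE from the first TRUE onwards
any_so_far <- as.logical(cummax(as.integer(reached)))

# Cumulative "and": TRUE only until the first FALSE
all_so_far <- as.logical(cummin(as.integer(valid)))

print(any_so_far)
print(all_so_far)
```

With these inputs, `any_so_far` stays `TRUE` from the third element on, which is the "all rows after a condition is first true" pattern, while `all_so_far` turns `FALSE` at the third element and stays there.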

    cumany() and cumall() are useful for selecting all rows up to, or all rows after, a condition is true for the first (or last) time. For example, we can use cumany() to find all records for a player after they played a year with 150 games:

    -
    filter(players, cumany(G > 150))
    +

    Like lead and lag, you may want to control the order in which the accumulation occurs. None of the built in functions have an order_by argument so dplyr provides a helper: order_by(). You give it the variable you want to order by, and then the call to the window function:

    -
    x <- 1:10
    -y <- 10:1
    -order_by(y, cumsum(x))
    -#>  [1] 55 54 52 49 45 40 34 27 19 10
    +

    This function uses a bit of non-standard evaluation, so I wouldn’t recommend using it inside another function; use the simpler but less concise with_order() instead.
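The idea behind a standard-evaluation helper can be sketched in base R: sort the input by the ordering variable, apply the window function, then put the results back in the original row order. The function name `apply_in_order` is hypothetical and is not part of dplyr; it only illustrates why passing the ordering variable as an ordinary argument is safe inside other functions.

```r
# Hypothetical helper: apply `fun` to `x` as if `x` were sorted by `by`,
# then return the results in the original order of `x`.
apply_in_order <- function(by, fun, x, ...) {
  ord <- order(by)                    # positions that sort `by`
  undo <- match(seq_along(by), ord)   # positions that restore input order
  fun(x[ord], ...)[undo]
}

x <- 1:10
y <- 10:1
running <- apply_in_order(y, cumsum, x)
print(running)
```

Because `by`, `fun` and `x` are all evaluated as regular arguments, the helper behaves the same whether it is called at the top level or from inside another function.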

    Recycled aggregates

    R’s vector recycling makes it easy to select values that are higher or lower than a summary. I call this a recycled aggregate because the value of the aggregate is recycled to be the same length as the original vector. Recycled aggregates are useful if you want to find all records greater than the mean or less than the median:

    -
    filter(players, G > mean(G))
    -filter(players, G < median(G))
    +

    While most SQL databases don’t have an equivalent of median() or quantile(), when filtering you can achieve the same effect with ntile(). For example, x > median(x) is equivalent to ntile(x, 2) == 2; x > quantile(x, 0.75) is equivalent to ntile(x, 100) > 75 or ntile(x, 4) > 3.
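The median case can be checked on a small vector. The sketch below uses a hypothetical base R stand-in, `ntile_sketch()`, which assumes buckets are assigned from positional ranks; it is an illustration of the bucketing idea, not dplyr's own code, and the equivalence is only checked here for a vector without ties or missing values.

```r
# Hypothetical stand-in for a bucketing function: split the ranks of `x`
# into `n` buckets of (nearly) equal size.
ntile_sketch <- function(x, n) {
  r <- rank(x, ties.method = "first")
  as.integer(floor(n * (r - 1) / length(x) + 1))
}

x <- c(7, 3, 9, 1, 5, 11)

above_median <- x > median(x)
upper_half   <- ntile_sketch(x, 2) == 2

print(above_median)
print(upper_half)
```

With tied values the two filters can disagree, because a bucketing function has to split ties across buckets while a comparison against the median does not.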

    -
    filter(players, ntile(G, 2) == 2)
    +

    You can also use this idea to select the records with the highest (x == max(x)) or lowest value (x == min(x)) for a field, but the ranking functions give you more control over ties, and allow you to select any number of records.

    Recycled aggregates are also useful in conjunction with mutate(). For example, with the batting data, we could compute the “career year”, the number of years a player has played since they entered the league:

    -
    mutate(players, career_year = yearID - min(yearID) + 1)
    -#> # A tibble: 19,404 x 8
    -#> # Groups:   playerID [1,342]
    -#>   playerID  yearID teamID     G    AB     R     H career_year
    -#>   <chr>      <int> <fct>  <int> <int> <int> <int>       <dbl>
    -#> 1 aaronha01   1954 ML1      122   468    58   131           1
    -#> 2 aaronha01   1955 ML1      153   602   105   189           2
    -#> 3 aaronha01   1956 ML1      153   609   106   200           3
    -#> 4 aaronha01   1957 ML1      151   615   118   198           4
    -#> # ... with 19,400 more rows
    +

    Or, as in the introductory example, we could compute a z-score:

    -
    mutate(players, G_z = (G - mean(G)) / sd(G))
    -#> # A tibble: 19,404 x 8
    -#> # Groups:   playerID [1,342]
    -#>   playerID  yearID teamID     G    AB     R     H    G_z
    -#>   <chr>      <int> <fct>  <int> <int> <int> <int>  <dbl>
    -#> 1 aaronha01   1954 ML1      122   468    58   131 -1.16 
    -#> 2 aaronha01   1955 ML1      153   602   105   189  0.519
    -#> 3 aaronha01   1956 ML1      153   609   106   200  0.519
    -#> 4 aaronha01   1957 ML1      151   615   118   198  0.411
    -#> # ... with 19,400 more rows
    +
    diff -Nru r-cran-dplyr-0.7.6/inst/include/dplyr/GroupedDataFrame.h r-cran-dplyr-0.7.8/inst/include/dplyr/GroupedDataFrame.h --- r-cran-dplyr-0.7.6/inst/include/dplyr/GroupedDataFrame.h 2018-06-25 21:55:33.000000000 +0000 +++ r-cran-dplyr-0.7.8/inst/include/dplyr/GroupedDataFrame.h 2018-11-09 20:55:35.000000000 +0000 @@ -45,7 +45,7 @@ bool is_lazy = Rf_isNull(data_.attr("group_sizes")) || Rf_isNull(data_.attr("labels")); if (is_lazy) { - build_index_cpp(data_); + build_index_cpp_by_ref(data_); } group_sizes = data_.attr("group_sizes"); biggest_group_size = data_.attr("biggest_group_size"); diff -Nru r-cran-dplyr-0.7.6/inst/include/dplyr/HybridHandler.h r-cran-dplyr-0.7.8/inst/include/dplyr/HybridHandler.h --- r-cran-dplyr-0.7.6/inst/include/dplyr/HybridHandler.h 2018-06-22 22:59:22.000000000 +0000 +++ r-cran-dplyr-0.7.8/inst/include/dplyr/HybridHandler.h 2018-11-09 20:55:35.000000000 +0000 @@ -32,4 +32,6 @@ } +void build_index_cpp_by_ref(DataFrame& data); + #endif // dplyr_dplyr_HybridHandlerMap_H diff -Nru r-cran-dplyr-0.7.6/inst/include/dplyr/registration.h r-cran-dplyr-0.7.8/inst/include/dplyr/registration.h --- r-cran-dplyr-0.7.6/inst/include/dplyr/registration.h 2018-06-25 13:29:03.000000000 +0000 +++ r-cran-dplyr-0.7.8/inst/include/dplyr/registration.h 2018-11-09 20:55:35.000000000 +0000 @@ -5,7 +5,7 @@ #if defined(COMPILING_DPLYR) -void build_index_cpp(DataFrame& data); +DataFrame build_index_cpp(DataFrame data); void registerHybridHandler(const char*, dplyr::HybridHandler); SEXP get_time_classes(); diff -Nru r-cran-dplyr-0.7.6/inst/include/dplyr_RcppExports.h r-cran-dplyr-0.7.8/inst/include/dplyr_RcppExports.h --- r-cran-dplyr-0.7.6/inst/include/dplyr_RcppExports.h 2018-06-25 13:29:03.000000000 +0000 +++ r-cran-dplyr-0.7.8/inst/include/dplyr_RcppExports.h 2018-11-09 20:55:35.000000000 +0000 @@ -39,6 +39,8 @@ } if (rcpp_result_gen.inherits("interrupted-error")) throw Rcpp::internal::InterruptedException(); + if 
(Rcpp::internal::isLongjumpSentinel(rcpp_result_gen)) + throw Rcpp::LongjumpException(rcpp_result_gen); if (rcpp_result_gen.inherits("try-error")) throw Rcpp::exception(Rcpp::as(rcpp_result_gen).c_str()); return Rcpp::as(rcpp_result_gen); @@ -58,16 +60,18 @@ } if (rcpp_result_gen.inherits("interrupted-error")) throw Rcpp::internal::InterruptedException(); + if (Rcpp::internal::isLongjumpSentinel(rcpp_result_gen)) + throw Rcpp::LongjumpException(rcpp_result_gen); if (rcpp_result_gen.inherits("try-error")) throw Rcpp::exception(Rcpp::as(rcpp_result_gen).c_str()); return Rcpp::as(rcpp_result_gen); } - inline void build_index_cpp(DataFrame& data) { + inline DataFrame build_index_cpp(DataFrame data) { typedef SEXP(*Ptr_build_index_cpp)(SEXP); static Ptr_build_index_cpp p_build_index_cpp = NULL; if (p_build_index_cpp == NULL) { - validateSignature("void(*build_index_cpp)(DataFrame&)"); + validateSignature("DataFrame(*build_index_cpp)(DataFrame)"); p_build_index_cpp = (Ptr_build_index_cpp)R_GetCCallable("dplyr", "_dplyr_build_index_cpp"); } RObject rcpp_result_gen; @@ -77,8 +81,11 @@ } if (rcpp_result_gen.inherits("interrupted-error")) throw Rcpp::internal::InterruptedException(); + if (Rcpp::internal::isLongjumpSentinel(rcpp_result_gen)) + throw Rcpp::LongjumpException(rcpp_result_gen); if (rcpp_result_gen.inherits("try-error")) throw Rcpp::exception(Rcpp::as(rcpp_result_gen).c_str()); + return Rcpp::as(rcpp_result_gen); } } diff -Nru r-cran-dplyr-0.7.6/man/all_equal.Rd r-cran-dplyr-0.7.8/man/all_equal.Rd --- r-cran-dplyr-0.7.6/man/all_equal.Rd 2018-03-13 20:12:24.000000000 +0000 +++ r-cran-dplyr-0.7.8/man/all_equal.Rd 2018-11-09 20:55:35.000000000 +0000 @@ -5,8 +5,8 @@ \alias{all.equal.tbl_df} \title{Flexible equality comparison for data frames} \usage{ -all_equal(target, current, ignore_col_order = TRUE, ignore_row_order = TRUE, - convert = FALSE, ...) +all_equal(target, current, ignore_col_order = TRUE, + ignore_row_order = TRUE, convert = FALSE, ...) 
\method{all.equal}{tbl_df}(target, current, ignore_col_order = TRUE, ignore_row_order = TRUE, convert = FALSE, ...) diff -Nru r-cran-dplyr-0.7.6/man/arrange_all.Rd r-cran-dplyr-0.7.8/man/arrange_all.Rd --- r-cran-dplyr-0.7.6/man/arrange_all.Rd 2018-06-25 13:29:03.000000000 +0000 +++ r-cran-dplyr-0.7.8/man/arrange_all.Rd 2018-11-09 20:55:35.000000000 +0000 @@ -25,8 +25,7 @@ lambda-formula.} \item{...}{Additional arguments for the function calls in -\code{.funs}. These are evaluated only once, with \link[rlang:tidy-dots]{tidy -dots} support.} +\code{.funs}. These are evaluated only once, with \link[rlang:tidy-dots]{tidy dots} support.} \item{.vars}{A list of columns generated by \code{\link[=vars]{vars()}}, a character vector of column names, a numeric vector of column diff -Nru r-cran-dplyr-0.7.6/man/as.table.tbl_cube.Rd r-cran-dplyr-0.7.8/man/as.table.tbl_cube.Rd --- r-cran-dplyr-0.7.6/man/as.table.tbl_cube.Rd 2018-03-13 20:12:24.000000000 +0000 +++ r-cran-dplyr-0.7.8/man/as.table.tbl_cube.Rd 2018-11-09 20:55:35.000000000 +0000 @@ -3,14 +3,14 @@ \name{as.table.tbl_cube} \alias{as.table.tbl_cube} \alias{as.data.frame.tbl_cube} -\alias{as_data_frame.tbl_cube} +\alias{as_tibble.tbl_cube} \title{Coerce a \code{tbl_cube} to other data structures} \usage{ \method{as.table}{tbl_cube}(x, ..., measure = 1L) \method{as.data.frame}{tbl_cube}(x, ...) -\method{as_data_frame}{tbl_cube}(x, ...) +\method{as_tibble}{tbl_cube}(x, ...) } \arguments{ \item{x}{a \code{tbl_cube}} diff -Nru r-cran-dplyr-0.7.6/man/backend_dbplyr.Rd r-cran-dplyr-0.7.8/man/backend_dbplyr.Rd --- r-cran-dplyr-0.7.6/man/backend_dbplyr.Rd 2018-05-09 07:35:42.000000000 +0000 +++ r-cran-dplyr-0.7.8/man/backend_dbplyr.Rd 2018-11-09 20:55:35.000000000 +0000 @@ -69,7 +69,8 @@ db_query_rows(con, sql, ...) sql_select(con, select, from, where = NULL, group_by = NULL, - having = NULL, order_by = NULL, limit = NULL, distinct = FALSE, ...) + having = NULL, order_by = NULL, limit = NULL, distinct = FALSE, + ...) 
sql_subquery(con, from, name = random_table_name(), ...) @@ -115,7 +116,7 @@ \item \code{db_create_index()}: Builds and executes a \code{CREATE INDEX ON } SQL command. \item \code{db_drop_table()}: Builds and executes a -\code{DROP TABLE [IF EXISTS]
    } SQL command. +\code{DROP TABLE [IF EXISTS]
    } SQL command. \item \code{db_analyze()}: Builds and executes an \code{ANALYZE
    } SQL command. } diff -Nru r-cran-dplyr-0.7.6/man/bench_compare.Rd r-cran-dplyr-0.7.8/man/bench_compare.Rd --- r-cran-dplyr-0.7.6/man/bench_compare.Rd 2018-03-13 20:12:24.000000000 +0000 +++ r-cran-dplyr-0.7.8/man/bench_compare.Rd 2018-11-09 20:55:35.000000000 +0000 @@ -13,8 +13,8 @@ compare_tbls(tbls, op, ref = NULL, compare = equal_data_frame, ...) -compare_tbls2(tbls_x, tbls_y, op, ref = NULL, compare = equal_data_frame, - ...) +compare_tbls2(tbls_x, tbls_y, op, ref = NULL, + compare = equal_data_frame, ...) eval_tbls(tbls, op) diff -Nru r-cran-dplyr-0.7.6/man/copy_to.Rd r-cran-dplyr-0.7.8/man/copy_to.Rd --- r-cran-dplyr-0.7.6/man/copy_to.Rd 2018-03-13 20:12:24.000000000 +0000 +++ r-cran-dplyr-0.7.8/man/copy_to.Rd 2018-11-09 20:55:35.000000000 +0000 @@ -4,7 +4,8 @@ \alias{copy_to} \title{Copy a local data frame to a remote src} \usage{ -copy_to(dest, df, name = deparse(substitute(df)), overwrite = FALSE, ...) +copy_to(dest, df, name = deparse(substitute(df)), overwrite = FALSE, + ...) } \arguments{ \item{dest}{remote data source} diff -Nru r-cran-dplyr-0.7.6/man/group_by_all.Rd r-cran-dplyr-0.7.8/man/group_by_all.Rd --- r-cran-dplyr-0.7.6/man/group_by_all.Rd 2018-06-25 13:29:03.000000000 +0000 +++ r-cran-dplyr-0.7.8/man/group_by_all.Rd 2018-11-09 20:55:35.000000000 +0000 @@ -25,8 +25,7 @@ lambda-formula.} \item{...}{Additional arguments for the function calls in -\code{.funs}. These are evaluated only once, with \link[rlang:tidy-dots]{tidy -dots} support.} +\code{.funs}. 
These are evaluated only once, with \link[rlang:tidy-dots]{tidy dots} support.} \item{.vars}{A list of columns generated by \code{\link[=vars]{vars()}}, a character vector of column names, a numeric vector of column diff -Nru r-cran-dplyr-0.7.6/man/join.Rd r-cran-dplyr-0.7.8/man/join.Rd --- r-cran-dplyr-0.7.6/man/join.Rd 2018-06-25 13:29:03.000000000 +0000 +++ r-cran-dplyr-0.7.8/man/join.Rd 2018-11-09 20:55:35.000000000 +0000 @@ -10,11 +10,13 @@ \alias{anti_join} \title{Join two tbls together} \usage{ -inner_join(x, y, by = NULL, copy = FALSE, suffix = c(".x", ".y"), ...) +inner_join(x, y, by = NULL, copy = FALSE, suffix = c(".x", ".y"), + ...) left_join(x, y, by = NULL, copy = FALSE, suffix = c(".x", ".y"), ...) -right_join(x, y, by = NULL, copy = FALSE, suffix = c(".x", ".y"), ...) +right_join(x, y, by = NULL, copy = FALSE, suffix = c(".x", ".y"), + ...) full_join(x, y, by = NULL, copy = FALSE, suffix = c(".x", ".y"), ...) diff -Nru r-cran-dplyr-0.7.6/man/join.tbl_df.Rd r-cran-dplyr-0.7.8/man/join.tbl_df.Rd --- r-cran-dplyr-0.7.6/man/join.tbl_df.Rd 2018-06-25 13:29:03.000000000 +0000 +++ r-cran-dplyr-0.7.8/man/join.tbl_df.Rd 2018-11-09 20:55:35.000000000 +0000 @@ -14,15 +14,17 @@ suffix = c(".x", ".y"), ..., na_matches = pkgconfig::get_config("dplyr::na_matches")) -\method{left_join}{tbl_df}(x, y, by = NULL, copy = FALSE, suffix = c(".x", - ".y"), ..., na_matches = pkgconfig::get_config("dplyr::na_matches")) +\method{left_join}{tbl_df}(x, y, by = NULL, copy = FALSE, + suffix = c(".x", ".y"), ..., + na_matches = pkgconfig::get_config("dplyr::na_matches")) \method{right_join}{tbl_df}(x, y, by = NULL, copy = FALSE, suffix = c(".x", ".y"), ..., na_matches = pkgconfig::get_config("dplyr::na_matches")) -\method{full_join}{tbl_df}(x, y, by = NULL, copy = FALSE, suffix = c(".x", - ".y"), ..., na_matches = pkgconfig::get_config("dplyr::na_matches")) +\method{full_join}{tbl_df}(x, y, by = NULL, copy = FALSE, + suffix = c(".x", ".y"), ..., + na_matches = 
pkgconfig::get_config("dplyr::na_matches")) \method{semi_join}{tbl_df}(x, y, by = NULL, copy = FALSE, ..., na_matches = pkgconfig::get_config("dplyr::na_matches")) diff -Nru r-cran-dplyr-0.7.6/man/recode.Rd r-cran-dplyr-0.7.8/man/recode.Rd --- r-cran-dplyr-0.7.6/man/recode.Rd 2018-06-25 13:29:03.000000000 +0000 +++ r-cran-dplyr-0.7.8/man/recode.Rd 2018-11-09 20:55:35.000000000 +0000 @@ -7,7 +7,8 @@ \usage{ recode(.x, ..., .default = NULL, .missing = NULL) -recode_factor(.x, ..., .default = NULL, .missing = NULL, .ordered = FALSE) +recode_factor(.x, ..., .default = NULL, .missing = NULL, + .ordered = FALSE) } \arguments{ \item{.x}{A vector to modify} diff -Nru r-cran-dplyr-0.7.6/man/sample.Rd r-cran-dplyr-0.7.8/man/sample.Rd --- r-cran-dplyr-0.7.6/man/sample.Rd 2018-03-13 20:12:24.000000000 +0000 +++ r-cran-dplyr-0.7.8/man/sample.Rd 2018-11-09 20:55:35.000000000 +0000 @@ -8,7 +8,8 @@ \usage{ sample_n(tbl, size, replace = FALSE, weight = NULL, .env = NULL) -sample_frac(tbl, size = 1, replace = FALSE, weight = NULL, .env = NULL) +sample_frac(tbl, size = 1, replace = FALSE, weight = NULL, + .env = NULL) } \arguments{ \item{tbl}{tbl of data.} diff -Nru r-cran-dplyr-0.7.6/man/scoped.Rd r-cran-dplyr-0.7.8/man/scoped.Rd --- r-cran-dplyr-0.7.6/man/scoped.Rd 2018-03-25 22:38:34.000000000 +0000 +++ r-cran-dplyr-0.7.8/man/scoped.Rd 2018-11-09 20:55:35.000000000 +0000 @@ -26,8 +26,7 @@ functions and strings representing function names.} \item{...}{Additional arguments for the function calls in -\code{.funs}. These are evaluated only once, with \link[rlang:tidy-dots]{tidy -dots} support.} +\code{.funs}. 
These are evaluated only once, with \link[rlang:tidy-dots]{tidy dots} support.} } \description{ The variants suffixed with \code{_if}, \code{_at} or \code{_all} apply an diff -Nru r-cran-dplyr-0.7.6/man/select_all.Rd r-cran-dplyr-0.7.8/man/select_all.Rd --- r-cran-dplyr-0.7.6/man/select_all.Rd 2018-06-25 13:29:03.000000000 +0000 +++ r-cran-dplyr-0.7.8/man/select_all.Rd 2018-11-09 20:55:35.000000000 +0000 @@ -28,8 +28,7 @@ quosure, a string naming a function, or a function.} \item{...}{Additional arguments for the function calls in -\code{.funs}. These are evaluated only once, with \link[rlang:tidy-dots]{tidy -dots} support.} +\code{.funs}. These are evaluated only once, with \link[rlang:tidy-dots]{tidy dots} support.} \item{.predicate}{A predicate function to be applied to the columns or a logical vector. The variables for which \code{.predicate} is or diff -Nru r-cran-dplyr-0.7.6/man/summarise_all.Rd r-cran-dplyr-0.7.8/man/summarise_all.Rd --- r-cran-dplyr-0.7.6/man/summarise_all.Rd 2018-03-25 22:38:34.000000000 +0000 +++ r-cran-dplyr-0.7.8/man/summarise_all.Rd 2018-11-09 20:55:35.000000000 +0000 @@ -52,8 +52,7 @@ lambda-formula.} \item{...}{Additional arguments for the function calls in -\code{.funs}. These are evaluated only once, with \link[rlang:tidy-dots]{tidy -dots} support.} +\code{.funs}. These are evaluated only once, with \link[rlang:tidy-dots]{tidy dots} support.} \item{.predicate}{A predicate function to be applied to the columns or a logical vector. 
The variables for which \code{.predicate} is or diff -Nru r-cran-dplyr-0.7.6/MD5 r-cran-dplyr-0.7.8/MD5 --- r-cran-dplyr-0.7.6/MD5 2018-06-29 21:23:20.000000000 +0000 +++ r-cran-dplyr-0.7.8/MD5 2018-11-10 07:30:02.000000000 +0000 @@ -1,8 +1,8 @@ -0889d9c8b30d9d32b4f0a672f93d749a *DESCRIPTION +79c0ccbfa237cbf41f3f0fa2bf8e9d39 *DESCRIPTION c7180788a8ec3035d54fc733f8939ade *LICENSE -46de99cff16daf1f316be421bdf8272f *NAMESPACE -a1cbdaf2e82ce1c522454d3d3a678311 *NEWS.md -aec67570825453d8b843ca0493e6dbb2 *R/RcppExports.R +9bb645db67d147e423d6b1c91166d86f *NAMESPACE +bdbfbbe983fee8136c2bdd6a296ff47f *NEWS.md +24c49ac9179e8e7c06a0c93e12d18430 *R/RcppExports.R 385c938e7ddaf4321adf4ee195f23f64 *R/all-equal.r 9d5b6b054df3934c26af87b53d1d0bc0 *R/bench-compare.r 591535f1f8ba74e9e948a3b7d18e1019 *R/bind.r @@ -13,8 +13,8 @@ 1dd432a4e2d4074eefc4973c8633fb7f *R/colwise-group-by.R b0fa3a0239acaee1ee09446dc9e9e5f3 *R/colwise-mutate.R ee061a80192263f7c5e4e8ab94d24eca *R/colwise-select.R -c481ee14336d094255ad0ebaa6c4e27c *R/colwise.R -e9d206e478ead7a838a9fd73892063ce *R/compat-dbplyr.R +4e2301217278671c2b7d5ffedb79f452 *R/colwise.R +ee44784be10c1bcd3d8c87a5c0148d37 *R/compat-dbplyr.R ba6b339b2ea8ce6d2afaa268832dd23e *R/compat-lazyeval.R 1f93a1dc9780598f4c7fed9392807141 *R/compat-purrr.R dc4ec82504fd1f99c5de95eb3fcfcf41 *R/compute-collect.r @@ -24,27 +24,27 @@ a08ae483d3ebcd598d3b457c77415561 *R/data-nasa.r c11f9a6ce16ea9d659aa93e36f133a42 *R/data-starwars.R 89e5bdb4b68a8fe03836bd4623b251e5 *R/data-storms.R -4e0e30e4a700da2a671267b8d1a2ce8d *R/dataframe.R +88dd82631dcd6ab27e79edc0b6394499 *R/dataframe.R 15ed50b9e7d6f0db6881250facc997cf *R/dbplyr.R 4a41a291a9396a2e2692a05eb8a08982 *R/desc.r -110ab309e577826fec761f71fa0e8cad *R/distinct.R -f4267e6fe5e7daffb691491e73394a02 *R/do.r +fcdc28319a21d46da03c18426c3d80d3 *R/distinct.R +4388bc21ebd2ff8d39d23e0bccd6a15c *R/do.r 1ae45b745dc27114a9a2ace6625b597b *R/dplyr.r ce0d3e95149be325fb2338305b748d92 *R/dr.R 1f23b698c157af3033a5c41580499f6c 
*R/error.R 64db247d9d02897dfa0770acbe43a9b3 *R/explain.r 55f7f1e4983897a6a1823e06b1931921 *R/failwith.r cb09e9a83f70a9e225d3248619a7477a *R/funs-predicates.R -e20dd90ea648534b9e9c54b138c84c60 *R/funs.R -7fa7593aa7bb4cc8fbe2ae48b201ed97 *R/group-by.r +2150c234d8df92bb8bb350b152664d14 *R/funs.R +c7ca4c2ab4cb1172686f9c17989a1622 *R/group-by.r 22b62e21c7e8a5e4d1bd01290da50773 *R/group-indices.R b9eb655a68656718faed6b6293b0bd12 *R/group-size.r -60029d688edf9c794f28f46e34d87b94 *R/grouped-df.r +6b8476d8f70c79b9cc20d8715d836721 *R/grouped-df.r a218ae2c696a0a109dad9cbb2fa705ef *R/hybrid.R ddab701f521a5f1537d0c6d5b99347ac *R/id.r 2c7111fdc7276ef5801bfa6d0ad34183 *R/if_else.R cbbb63ead2d57c3f54c79af335b67a46 *R/inline.r -5a0d247afadd5c59d0824e7cc5c0a44c *R/join-vars.R +868eb09fc81ec74a83762aae2bc19c18 *R/join-vars.R 220c9537def00e1951fbf7348326fa61 *R/join.r b0b31475ba61638b58eeeff8b6d57a5c *R/lead-lag.R 75cfc3074889bddeaa6e526370c1d55c *R/location.R @@ -53,7 +53,7 @@ ee09f0943c096342251416f4b8fa975d *R/na_if.R 9f92869c84547da11884ad5523fc4697 *R/near.R 3c76c37d9fb4f6f094a75319aaeddae8 *R/nth-value.R -f72d03b55ba939be2f6fad9fcff66e83 *R/order-by.R +3d3bfbebd82b6c43f09c3f6d9e660ede *R/order-by.R 8f1738d2972fa729df8a4307b1bef343 *R/progress.R 1a3ce68627dbedd646cc2cceb7073150 *R/pull.R c0243e22143f7385f31e1db7024f75af *R/rank.R @@ -61,25 +61,25 @@ ec7281d7d70fcb2f03e390cd591011dc *R/recode.R 6d5af4dec50fa749662d716546b8ed9f *R/reexport-tibble.r b371914610f06e43aabaad0a93a9919f *R/reexport-tidyselect.R -d48713ad9d9c4f9ab5147fb01f83b7fa *R/rowwise.r +f09d5f507d4a01ad0815c10a26a4a7f2 *R/rowwise.r eba2f88afe502917471a861fda344314 *R/sample.R c8cff197ad1a21702fc738897ba73688 *R/sets.r 669645c36ac8ea26f05c67f6ef0e4881 *R/src-local.r 4944ea83dd1f1f8c7ddbcda96cbfe354 *R/src.r 4822eceeb65e96b19502e89850855c2f *R/src_dbi.R -eece1aab93ee8a7f8a7777f3caf43e6a *R/tbl-cube.r -9e179f4aa906f8cd0611c439ce23ee85 *R/tbl-df.r +7a5cff71a1ab59570e8b24e94c1e2c84 *R/tbl-cube.r 
+6ea2ade3c33d422267fc935c6a794523 *R/tbl-df.r 2b8db5ecd1cf6aa8e5a93ad5dd6f8436 *R/tbl.r a261788caa6592561fbdf5c128a573b6 *R/top-n.R 76d10c70d94a48bb5cf644707dee557a *R/ts.R -38f267e41df3e13954180612e140680c *R/utils-expr.R +6266a8aa9698151d1e0b6cf001cb0510 *R/utils-expr.R 65f972e2a462c2ccc70ab73a0513b6bb *R/utils-format.r 96128219b79337ae93ff3be8066bc218 *R/utils-replace-with.R 5488b8bb5b0ce5a37b696ca051e05163 *R/utils-tidy-eval.R -f99c1aee05d186f22c43bc31eba442e9 *R/utils.r -9f3433e33d9bd0e9f80602a8d9f0a01e *R/zzz.r +1c4f0aeec78830576744c954efc892a3 *R/utils.r +fa72c5352e8d4f62af4232918b87eb8f *R/zzz.r 1ba97ecd4d488818d287e974618bf62f *README.md -0dfd519a637d10cfedd01663de4e4426 *build/vignette.rds +b664be72e8bf832a775e10e56e14bb65 *build/vignette.rds a79561c8013e7a7f3c23d509f4918bf8 *data/band_instruments.rda 3aa4b1478fc31219480e88c876c3aeed *data/band_instruments2.rda 4d44ad5e4198daccbd4227dca895750b *data/band_members.rda @@ -88,19 +88,19 @@ beee782d83b4bd711c01658781fbf643 *data/storms.rda 730421f4acdd17a3d2fc909fe5c52347 *inst/doc/compatibility.R 28ae4608638f94301d7020b2ffc6bd09 *inst/doc/compatibility.Rmd -0fa3d105aff3608b24567b43842ddd0f *inst/doc/compatibility.html +95aa1b10003863fd7d1f20fbf1ab277b *inst/doc/compatibility.html 517cadf4db98fab1155e5ee7d9f6b5da *inst/doc/dplyr.R 9b47681a9d447e5d1953702b91e0071d *inst/doc/dplyr.Rmd -a848fe6fb549f188f89ccc85f5a32725 *inst/doc/dplyr.html -96d891611a68aa47639020061604f226 *inst/doc/programming.R -7d1c17cf7d3d4b7dea27bb9145bf08ea *inst/doc/programming.Rmd -a65c186af64df2f47be73103c62e8ea3 *inst/doc/programming.html +f5f869592383ccfe734d187fadd64950 *inst/doc/dplyr.html +f7dcc6328af7ba1f6d8c7e6798a3023b *inst/doc/programming.R +9bbe7305c79f7c52537072f53ac4e97a *inst/doc/programming.Rmd +6d2ebfc16355be6271d5caf73e4f8325 *inst/doc/programming.html 3bc4473fe5f8257cd9342036475ce6c1 *inst/doc/two-table.R 2e918ccb3f55c5edb23e90a66c22ec0f *inst/doc/two-table.Rmd -5244bd1bccdc0bb15e1e34edb49404b5 *inst/doc/two-table.html 
+3d1b8903fb4f54b909c3a9a2ddb92140 *inst/doc/two-table.html aaaac3e4250247da6ab5015f7db18a2d *inst/doc/window-functions.R d4563df394699b6eab7fe746a4d4170b *inst/doc/window-functions.Rmd -74978e520c65afe6027b552dd141da4f *inst/doc/window-functions.html +8d741999683789654dbba904548d294a *inst/doc/window-functions.html 3e424de5198f078ad18872984990b435 *inst/include/dplyr.h fddaabf5a705772fffff51c593a77f73 *inst/include/dplyr/BoolResult.h d896149e837aaa3f10d7f5b29a38ca0c *inst/include/dplyr/CharacterVectorOrderer.h @@ -114,10 +114,10 @@ 92eadf90f4af99d52f74b5f4c5a8b28c *inst/include/dplyr/DataFrameVisitorsIndexMap.h 806d5dbc5b01c68f3bb80d685d1c9e2d *inst/include/dplyr/EmptySubset.h 08fa0e226d0c30780b145c9a442ca732 *inst/include/dplyr/Gatherer.h -34036ffd3f6a97b738bcbdac814e33ab *inst/include/dplyr/GroupedDataFrame.h +63a0cd442858f5803c3c968936132b89 *inst/include/dplyr/GroupedDataFrame.h e6ae71c9cfe96cad9919c29050b5c396 *inst/include/dplyr/Groups.h 1e822b22eed00c17ac5602ed4fe4f55f *inst/include/dplyr/Hybrid.h -a81736ae950e5967bbdadb8e7f47cb24 *inst/include/dplyr/HybridHandler.h +135834be28c89a8e3de708899a527f3a *inst/include/dplyr/HybridHandler.h 97047891911ed19d9b2713292b0d38b3 *inst/include/dplyr/HybridHandlerMap.h 5ac8463b460c8a8086109f051f707ae0 *inst/include/dplyr/JoinVisitor.h 22a0e143fc83e14203038746d0718744 *inst/include/dplyr/JoinVisitorImpl.h @@ -179,7 +179,7 @@ fcbb0b106f5368eb978b20eb62b6a242 *inst/include/dplyr/get_column.h 3d9793d3f4aed92f771e592d1722315e *inst/include/dplyr/join_match.h 5743df751edac69e36be84d3ef7dd00e *inst/include/dplyr/main.h -82a2dfc8128c4e0f8d64115524af1119 *inst/include/dplyr/registration.h +08a6ad3900bf49dad2a5ffd713e30c05 *inst/include/dplyr/registration.h 90201ca5a1d0b13233666bbb97507d74 *inst/include/dplyr/subset_visitor.h 26ffa0a46db9ee141ddeca22d5efb8d9 *inst/include/dplyr/subset_visitor_impl.h d821ac459208bed5942ba5f94905c550 *inst/include/dplyr/tbl_cpp.h @@ -203,7 +203,7 @@ d97a391a75cd0135078f0f23bc7530a1 
*inst/include/dplyr/workarounds.h 0f10f6bdf53eb88da5c34f5c38092485 *inst/include/dplyr/workarounds/static_assert.h acbe82c8b4bffbbdc46a7dcf396f25d5 *inst/include/dplyr/workarounds/xlen.h -29e7a81204baf1ccc10244b064bf8780 *inst/include/dplyr_RcppExports.h +1543159b8d7d754b3c0b7db76be71fd1 *inst/include/dplyr_RcppExports.h 24c588d16003d98d4091a47052041d06 *inst/include/dplyr_types.h 82b929bfe98b923eaf0afa2dd619cd02 *inst/include/solaris/solaris.h c37a2ff952d99b600fceb515f0b870f2 *inst/include/tools/Call.h @@ -226,16 +226,16 @@ 8c918b3701081740dc58a11f435762dd *inst/include/tools/utils.h f5dc52277c4274ed3609c0592e400efb *inst/include/tools/wrap_subset.h a225fd929ff032a7d0de024080c53317 *man/add_rownames.Rd -8ebc5399b6e820089bb6cda7578dc0c2 *man/all_equal.Rd +70ee9ebe9bb2903bc7f86cf199b92e3a *man/all_equal.Rd af4dd958ae305bc8e00b459e72bdd6ae *man/all_vars.Rd bbcae91e34a23f50b7cffe512b6243df *man/arrange.Rd -36c4aba970e7c62b81e1e6912fc5dc1b *man/arrange_all.Rd -d231302df4f1df45262ba47bf8ad4a0a *man/as.table.tbl_cube.Rd +7b92551cb1589450b6442b65b8941849 *man/arrange_all.Rd +3f42a15710cfc2177785058c3b6cc330 *man/as.table.tbl_cube.Rd 4ef9ac02e06e8231e47bc0ba603f047c *man/as.tbl_cube.Rd 137be2eff7b7ad639f186670f6d93a00 *man/auto_copy.Rd -74453eb6f2b017829c024e91922e7396 *man/backend_dbplyr.Rd +da960221d2ac181140e7cacd96c61c82 *man/backend_dbplyr.Rd e81531ed16876cb1bcfc57b89f4e4673 *man/band_members.Rd -f75e5017449d740c6d1064afe0182250 *man/bench_compare.Rd +29e3e6ced8284b1c0212a9a4c1b49f0f *man/bench_compare.Rd 65ccf23d43309ef888abf18a799e8623 *man/between.Rd 748db5adf805eed93534d7065051e1d1 *man/bind.Rd add107ce665c58bdaacda6a6d4543262 *man/case_when.Rd @@ -243,7 +243,7 @@ 51a14a4b6cf7190e4bdf09c89278025e *man/coalesce.Rd a9e659ed5ca31b048ac71cb9e66b383e *man/common_by.Rd 8b57c8dc55db0515b925f21809335928 *man/compute.Rd -cbe3e88cf9b11546a7e0f3134ddf477c *man/copy_to.Rd +fa3a60bf5ff8fb77fa26864f0a5eef96 *man/copy_to.Rd e08d1d0ae0996911495a866663ca4642 *man/cumall.Rd 
f0c7978518fbd44832836a74288b6bca *man/desc.Rd 88537f8e714f9460fbce7cd0952dd552 *man/dim_desc.Rd @@ -258,7 +258,7 @@ b716bbc065f17127a2df206c7f1aca87 *man/filter_all.Rd 897f5412927c01bffc534dfd477d551d *man/funs.Rd fab411c643cc62ee7ba17a03eab9cc56 *man/group_by.Rd -dc73b132f6437d3e240564022fa60b3e *man/group_by_all.Rd +77c398cbe3646767e505eda50d811817 *man/group_by_all.Rd 1f541b07bab33ec6158e734fbcc28610 *man/group_by_prepare.Rd d9b38100e7ff3cf56ac293fbf1a2127c *man/group_indices.Rd 14a4975898f60b4ad82c2a67d7bceac3 *man/group_size.Rd @@ -268,8 +268,8 @@ 2be780f90d448694b77d55fe3a2d03da *man/ident.Rd ac526ee481cb8465b06783bacaa9d5ea *man/if_else.Rd fed1d0a4957c37ae234ceb655095f717 *man/init_logging.Rd -c8db49f2208b8c26de7754079ea26276 *man/join.Rd -87848403e43b1adcef11fdf4f306e9b2 *man/join.tbl_df.Rd +d2108817050c903297af78d94455c0ff *man/join.Rd +4bbf6e0170b7bc8700c2d877518504ee *man/join.tbl_df.Rd 1fd64a5056b2e7e63cad54a5baef1a0f *man/lead-lag.Rd f2baa420397b241a7ce5524389d30165 *man/location.Rd accd69a53d9368ad2b36f225c9f99c44 *man/make_tbl.Rd @@ -284,15 +284,15 @@ 2366ead0f1e2d68dc23d5d7698509eca *man/progress_estimated.Rd 44e78e319da9183fe0c5d8bc0738600d *man/pull.Rd 05c5ef7f142860d24dcccdd200f674d9 *man/ranking.Rd -07ecaf6b4d22557e58147eb8db6ac7d3 *man/recode.Rd +5733af94ca8f662a4040e528f8274f46 *man/recode.Rd 514dfbe6543e91a19254d99cefcfe2a5 *man/reexports.Rd dd09cefe6c8e23a2d04e28b21f08a58d *man/rowwise.Rd f7b4ab90ebcaec811366c956b4d1401a *man/same_src.Rd -bf6eeb098029398bcbd1f766213c1075 *man/sample.Rd -9aa21dafb48150822784ed5fef0599b8 *man/scoped.Rd +f6144ccda9f3573ce3e1d9e6a9a8a8fc *man/sample.Rd +4f0db8d10b41ab5196b4505e428974ff *man/scoped.Rd c24323c0b2cfc1364f0f0da03feea49f *man/se-deprecated.Rd 3b81619eb6439cb67bb247fe19768765 *man/select.Rd -ae9021de9727fb8c42e05338873fd19e *man/select_all.Rd +98996f49686c5043a5c4a28f992bdcda *man/select_all.Rd 5788eb1f7418b2246a12039a1eccd67b *man/select_vars.Rd f4f385405f6a63285ae63c65e0bae7a0 *man/setops.Rd 
2d64a2855a429878cdf3d00743faf3a3 *man/slice.Rd @@ -304,7 +304,7 @@ c52615a2da3cdf11e122bf096a7333ac *man/starwars.Rd 2c982dd196f1cf47ff8c1813e0d20e1b *man/storms.Rd b2ae74a387d27a90411c3faad7978b3b *man/summarise.Rd -dc891b438426a5ef23efab49c47455b1 *man/summarise_all.Rd +821fa84be3b84d9122d3a4e52c72ab4e *man/summarise_all.Rd 3d462c19ed35905604548989ef663ffb *man/summarise_each.Rd 59524e9133823a87065195f6c71f835e *man/tally.Rd 4e70bd464f7c1dfa062b8cd00995742d *man/tbl.Rd @@ -317,7 +317,7 @@ 0f716a00c40a985424f71afd4d758a80 *man/with_order.Rd c1edeace16ce7538551c133b2ecab8b6 *src/Makevars 557d367d9b1adb5e426c68161ff8d243 *src/Makevars.win -1c9e0fb2ae6d04172d5eee0199b1089c *src/RcppExports.cpp +5a98a095adda7880e8c46a56fb7144e2 *src/RcppExports.cpp 37c770f963e3822189e4becaa5824bf3 *src/address.cpp 614d6511f6f57596e6656eaa84bc551f *src/api.cpp f84d2f262e904234bbb29d7a0431c9ff *src/arrange.cpp @@ -327,9 +327,9 @@ 59ecb8fea87aadbae0640d42ba2775c5 *src/distinct.cpp 4ac8ba8d07f6524095b9be5e05ec1426 *src/encoding.cpp c7aa22f5666a99d4d657b3c19eef8192 *src/filter.cpp -1a4e5d6929a9fe3d844bc083f4e3d7ef *src/group_by.cpp -a3d12303780748fd0a39238c26e7a59b *src/group_indices.cpp -fe98cf5946b3caa59f2db29e70e24a73 *src/hybrid.cpp +3bbb1a7bf03321cf94804087cbd09ef8 *src/group_by.cpp +7d60f3089abfd6921f86df907fd87c3f *src/group_indices.cpp +e5c19b71cbba535e4fdb5fe5092777ee *src/hybrid.cpp 546fbb9ae00ac7727dcf20dc46237d85 *src/hybrid_count.cpp 78e04e827edfab65c5fe49e51020c89e *src/hybrid_debug.cpp 4e9e2dfd8f870a7df7c16fd1cf467a7c *src/hybrid_in.cpp @@ -338,7 +338,7 @@ d63bce957d4304b95af4992f54c0c01b *src/hybrid_offset.cpp 2bc08c6ccad4d8d75e9383a5c86b46f5 *src/hybrid_simple.cpp baa55834a69805478bfa019a8d440e8b *src/hybrid_window.cpp -47d2b81ef2ee2317195b9f3eea62a8f5 *src/init.cpp +941719618362655cfc97318a5558eb05 *src/init.cpp 28c97142526601c4817dda1ba4c06985 *src/join.cpp 26d7b69c0b83ed4a3be512d91ba13399 *src/join_exports.cpp 05358049523905cdeddf3f361fa2cc27 *src/mutate.cpp @@ -363,38 
+363,38 @@ 6169399fa1c5737d0c318427311a5b27 *tests/testthat/test-arrange.r 49113f7f672913fe11e374d7f5c3c544 *tests/testthat/test-astyle.R 5433288875c5cedb640f9aad2f48550c *tests/testthat/test-between.R -7bdc1f02f1a6b92b37316f8e61b9d634 *tests/testthat/test-binds.R +d795c481b8b069185b960f56cc9cb1f2 *tests/testthat/test-binds.R 815b47c2e3137fddf8085c8aab5bcad5 *tests/testthat/test-case-when.R 84b7a73b11d5900d181d8c9e53abf837 *tests/testthat/test-coalesce.R e7560c50408b988acab314c7be094985 *tests/testthat/test-colwise-arrange.R 15d8630e9a61b3c74c6f9af776c09139 *tests/testthat/test-colwise-filter.R d450ebc86a9ddfba5339233d49aa74f7 *tests/testthat/test-colwise-group-by.R -3e6a3db9f7833b1686e61e9e15fbfa42 *tests/testthat/test-colwise-mutate.R +80847393b6c3d7b1b4c08b5384c3297f *tests/testthat/test-colwise-mutate.R 2533f619c7e6d623e67002ce9cd56598 *tests/testthat/test-colwise-select.R -096d647cd1cfc70ad4ca560754b5b083 *tests/testthat/test-colwise.R +e2c66a9379dcc84ede87c44052c88345 *tests/testthat/test-colwise.R adcb7c47a972c296a670f3961ce912e5 *tests/testthat/test-combine.R a18d0f50f76ac71e95e547a03620fa5a *tests/testthat/test-copy_to.R 39988efc666e80566c47eb579443096d *tests/testthat/test-copying.R 4a2a0b887f0c5d72a4d60b127147b70b *tests/testthat/test-count-tally.r 6ed341fde2d49835a4c45d3c6396a4d5 *tests/testthat/test-data_frame.R 4b336daf8a1a25ba2fee841c10b8b6d1 *tests/testthat/test-distinct.R -6914704423d419169a1a1e503a9e5c6c *tests/testthat/test-do.R +200da13d3f9418bd81b4aab9f2b93f13 *tests/testthat/test-do.R 08cb7c8afe120d2c3cc7e165865bff53 *tests/testthat/test-equality.r 6950c27cd3e4f590b99a2465bb37ea70 *tests/testthat/test-filter.r b4d9becfd0b48b6cb3836e2223be5f8f *tests/testthat/test-funs-predicates.R -8ce8be84ba0af75f196d967cbd2c92e9 *tests/testthat/test-funs.R +39c31e508db759413a1d81bdd9248280 *tests/testthat/test-funs.R 93647608d9519862b831bb04feba2c6d *tests/testthat/test-group-by.r 6ea56bf978c055050c2d8dba2c79519e *tests/testthat/test-group-indices.R 
bd21aafc0d45b03265d23f5f34b39f8f *tests/testthat/test-group-size.R 35dfee389361c5da9ef4f6d914172338 *tests/testthat/test-hybrid-traverse.R -e37e2615198cd0e8029e8c432d2a2db6 *tests/testthat/test-hybrid.R +cf1b4c526885adf66a642db4cc9cdf0e *tests/testthat/test-hybrid.R b0ed2c4ba15717aa1a196f35a910fd0b *tests/testthat/test-if-else.R 9083be60b404381c0aa9d7a88edd195b *tests/testthat/test-internals.r -47f8fb1b37435926fdbc9e830ba25136 *tests/testthat/test-joins.r +0445c4de819b7c4adf6f2463dee4b4d9 *tests/testthat/test-joins.r bc7752142803d4d4372128057aea3a75 *tests/testthat/test-lazyeval-compat.R a0f400d8e901c0de5bd19c4ebfe2444e *tests/testthat/test-lead-lag.R b6ee165d8c6af461cadd5347cd7f1d5e *tests/testthat/test-mutate-windowed.R -32430c157192e66a246da342ea1b7055 *tests/testthat/test-mutate.r +01b486d6e5e1e1d72361e7726e5c564c *tests/testthat/test-mutate.r 048f33fdb80ba215845bcb54acfe4079 *tests/testthat/test-n_distinct.R 7e5620dc7a74958744f339d8df3a485c *tests/testthat/test-na-if.R 8cd23acee47e4b128d9e46628116ac16 *tests/testthat/test-near.R @@ -406,9 +406,9 @@ 6d8c07435fe9e49da68bdf47770d9417 *tests/testthat/test-recode.R 5494b6afb99875bd699bdbc85d920bdc *tests/testthat/test-sample.R adc3514f9d2945978d775b25cb7e91e3 *tests/testthat/test-select.r -6de5d0b2537faa711966353ef8825cf6 *tests/testthat/test-sets.R +90851a873aa0cacba6a7746874616061 *tests/testthat/test-sets.R 1f449b334718a460db12012fff0dbddb *tests/testthat/test-slice.r -cf4ebbd6679fbb98444b37bf2365f3cf *tests/testthat/test-summarise.r +5a74f42cf3c9b8329174ed396c70a794 *tests/testthat/test-summarise.r 68d9af97a6e3003c798783e899e2f1b1 *tests/testthat/test-tbl-cube.R d7824b3609e1fdd9603588f3a102de01 *tests/testthat/test-tbl.R eb42f84eede2cf04a0dc254530dc0775 *tests/testthat/test-top-n.R @@ -416,12 +416,12 @@ 059ed6902eef67772ce4077422722ecc *tests/testthat/test-ts.R 9eb9a2dfbb9b2a0ed68c7bc12f25d102 *tests/testthat/test-underscore.R 7dff4a2ecad3803f9f55eafbefc50679 *tests/testthat/test-union-all.R 
-291180e5da776b7734e0dedc1c2f2b4b *tests/testthat/test-utils.R +f6102222212847e5aab003646e52c1ee *tests/testthat/test-utils.R 429e4f338d6170958168c1b5f9cb4da8 *tests/testthat/test-window.R a49019f22ad9e9314d08b68597f369ad *tests/testthat/utf-8.txt 28ae4608638f94301d7020b2ffc6bd09 *vignettes/compatibility.Rmd 9b47681a9d447e5d1953702b91e0071d *vignettes/dplyr.Rmd 45220b8aeb9e08f80d3e401e427ff77a *vignettes/internals/hybrid-evaluation.Rmd -7d1c17cf7d3d4b7dea27bb9145bf08ea *vignettes/programming.Rmd +9bbe7305c79f7c52537072f53ac4e97a *vignettes/programming.Rmd 2e918ccb3f55c5edb23e90a66c22ec0f *vignettes/two-table.Rmd d4563df394699b6eab7fe746a4d4170b *vignettes/window-functions.Rmd diff -Nru r-cran-dplyr-0.7.6/NAMESPACE r-cran-dplyr-0.7.8/NAMESPACE --- r-cran-dplyr-0.7.6/NAMESPACE 2018-06-25 13:29:03.000000000 +0000 +++ r-cran-dplyr-0.7.8/NAMESPACE 2018-11-09 20:55:35.000000000 +0000 @@ -14,7 +14,6 @@ S3method(as.data.frame,grouped_df) S3method(as.data.frame,rowwise_df) S3method(as.data.frame,tbl_cube) -S3method(as.data.frame,tbl_df) S3method(as.table,tbl_cube) S3method(as.tbl,data.frame) S3method(as.tbl,tbl) @@ -22,8 +21,8 @@ S3method(as.tbl_cube,data.frame) S3method(as.tbl_cube,matrix) S3method(as.tbl_cube,table) -S3method(as_data_frame,grouped_df) -S3method(as_data_frame,tbl_cube) +S3method(as_tibble,grouped_df) +S3method(as_tibble,tbl_cube) S3method(auto_copy,tbl_cube) S3method(auto_copy,tbl_df) S3method(cbind,grouped_df) diff -Nru r-cran-dplyr-0.7.6/NEWS.md r-cran-dplyr-0.7.8/NEWS.md --- r-cran-dplyr-0.7.6/NEWS.md 2018-06-25 21:55:33.000000000 +0000 +++ r-cran-dplyr-0.7.8/NEWS.md 2018-11-09 20:55:35.000000000 +0000 @@ -1,3 +1,21 @@ +# dplyr 0.7.8 + +* Fix return value of `setequal()` for data frames (#3704). + +* Remove `as.data.frame.tbl_df()` method for compatibility with R-devel (#3943). + +* Bump rlang dependency to 0.3.0. + +* Make compatibile with upcoming release of tibble. + +* Remove deprecated functions from programming vignette. 
+ +* Restore interface of the exported C++ function `build_index_cpp()` for compatibility with the _valr_ package. + +# dplyr 0.7.7 + +* Fix invalid character in `NEWS.md` file. + # dplyr 0.7.6 * `exprs()` is no longer exported to avoid conflicts with `Biobase::exprs()` @@ -1488,7 +1506,7 @@ * `mutate()` works for on zero-row grouped data frame, and with list columns (#555). -* `LazySubset` was confused about input data size (#452). +* `LazySubset` was confused about input data size (#452). * Internal `n_distinct()` is stricter about it's inputs: it requires one symbol which must be from the data frame (#567). diff -Nru r-cran-dplyr-0.7.6/R/colwise.R r-cran-dplyr-0.7.8/R/colwise.R --- r-cran-dplyr-0.7.6/R/colwise.R 2018-05-03 08:08:12.000000000 +0000 +++ r-cran-dplyr-0.7.8/R/colwise.R 2018-11-09 20:55:35.000000000 +0000 @@ -99,12 +99,12 @@ #' can use with scoped verbs. #' @export all_vars <- function(expr) { - set_attrs(enquo(expr), class = c("all_vars", "quosure", "formula")) + structure(enquo(expr), class = c("all_vars", "quosure", "formula")) } #' @rdname all_vars #' @export any_vars <- function(expr) { - set_attrs(enquo(expr), class = c("any_vars", "quosure", "formula")) + structure(enquo(expr), class = c("any_vars", "quosure", "formula")) } #' @export print.all_vars <- function(x, ...) { @@ -180,7 +180,7 @@ } n <- length(tibble_vars) - selected <- lgl_len(n) + selected <- new_logical(n) for (i in seq_len(n)) { selected[[i]] <- .p(.tbl[[tibble_vars[[i]]]], ...) 
} diff -Nru r-cran-dplyr-0.7.6/R/compat-dbplyr.R r-cran-dplyr-0.7.8/R/compat-dbplyr.R --- r-cran-dplyr-0.7.6/R/compat-dbplyr.R 2018-05-03 08:08:12.000000000 +0000 +++ r-cran-dplyr-0.7.8/R/compat-dbplyr.R 2018-11-09 20:55:35.000000000 +0000 @@ -28,8 +28,8 @@ obj <- getExportedValue("dbplyr", obj_name) obj_sym <- sym(obj_name) - dbplyr_sym <- lang("::", quote(dbplyr), obj_sym) - dplyr_sym <- lang("::", quote(dplyr), obj_sym) + dbplyr_sym <- call("::", quote(dbplyr), obj_sym) + dplyr_sym <- call("::", quote(dplyr), obj_sym) if (is.function(obj)) { args <- formals() diff -Nru r-cran-dplyr-0.7.6/R/dataframe.R r-cran-dplyr-0.7.8/R/dataframe.R --- r-cran-dplyr-0.7.6/R/dataframe.R 2018-06-25 21:55:26.000000000 +0000 +++ r-cran-dplyr-0.7.8/R/dataframe.R 2018-11-09 20:55:35.000000000 +0000 @@ -192,7 +192,7 @@ #' @export setequal.data.frame <- function(x, y, ...) { out <- equal_data_frame(x, y) - reconstruct_set(out, x) + as.logical(out) } reconstruct_set <- function(out, x) { @@ -224,18 +224,17 @@ args <- quos(...) named <- named_args(args) - # Create custom dynamic scope with `.` pronoun - # FIXME: Pass without splicing once child_env() calls env_bind() - # with explicit arguments - overscope <- child_env(NULL, !!!list(. 
= .data, .data = .data)) + # Create custom data mask with `.` pronoun + mask <- new_data_mask(new_environment()) + env_bind_do_pronouns(mask, .data) if (!named) { - out <- overscope_eval_next(overscope, args[[1]]) + out <- eval_tidy(args[[1]], mask) if (!inherits(out, "data.frame")) { bad("Result must be a data frame, not {fmt_classes(out)}") } } else { - out <- map(args, function(arg) list(overscope_eval_next(overscope, arg))) + out <- map(args, function(arg) list(eval_tidy(arg, mask))) names(out) <- names(args) out <- tibble::as_tibble(out, validate = FALSE) } diff -Nru r-cran-dplyr-0.7.6/R/distinct.R r-cran-dplyr-0.7.8/R/distinct.R --- r-cran-dplyr-0.7.6/R/distinct.R 2018-06-25 13:29:03.000000000 +0000 +++ r-cran-dplyr-0.7.8/R/distinct.R 2018-11-09 20:55:35.000000000 +0000 @@ -84,7 +84,10 @@ # If any calls, use mutate to add new columns, then distinct on those .data <- add_computed_columns(.data, vars) - vars <- exprs_auto_name(vars, printer = tidy_text) + with_options( + lifecycle_disable_verbose_retirement = TRUE, + vars <- exprs_auto_name(vars, printer = tidy_text) + ) # Once we've done the mutate, we no longer need lazy objects, and # can instead just use their names diff -Nru r-cran-dplyr-0.7.6/R/do.r r-cran-dplyr-0.7.8/R/do.r --- r-cran-dplyr-0.7.6/R/do.r 2018-05-03 08:08:12.000000000 +0000 +++ r-cran-dplyr-0.7.8/R/do.r 2018-11-09 20:55:35.000000000 +0000 @@ -104,6 +104,17 @@ # Helper functions ------------------------------------------------------------- +env_bind_do_pronouns <- function(env, data) { + if (is_function(data)) { + bind <- env_bind_active + } else { + bind <- env_bind + } + + # Use `:=` for `.` to avoid partial matching with `.env` + bind(env, "." 
:= data, .data = data) +} + label_output_dataframe <- function(labels, out, groups) { data_frame <- vapply(out[[1]], is.data.frame, logical(1)) if (any(!data_frame)) { diff -Nru r-cran-dplyr-0.7.6/R/funs.R r-cran-dplyr-0.7.8/R/funs.R --- r-cran-dplyr-0.7.6/R/funs.R 2018-05-03 08:08:12.000000000 +0000 +++ r-cran-dplyr-0.7.8/R/funs.R 2018-11-09 20:55:35.000000000 +0000 @@ -65,7 +65,7 @@ args <- list2(...) if (is_fun_list(.x)) { if (!is_empty(args)) { - .x[] <- map(.x, lang_modify, !!!args) + .x[] <- map(.x, call_modify, !!!args) } return(.x) } @@ -95,15 +95,15 @@ expr <- quo_get_expr(quo) - if (is_lang(expr, c("function", "~"))) { - top_level <- fmt_calls(expr[[1]]) + if (is_call(expr, c("function", "~"))) { + top_level <- fmt_obj(as_string(expr[[1]])) bad_args(quo_text(expr), "must be a function name (quoted or unquoted) or an unquoted call, not {top_level}") } - if (is_lang(expr) && !is_lang(expr, c("::", ":::"))) { - expr <- lang_modify(expr, !!!.args) + if (is_call(expr) && !is_call(expr, c("::", ":::"))) { + expr <- call_modify(expr, !!!.args) } else { - expr <- lang(expr, quote(.), !!!.args) + expr <- call2(expr, quote(.), !!!.args) } set_expr(quo, expr) diff -Nru r-cran-dplyr-0.7.6/R/group-by.r r-cran-dplyr-0.7.8/R/group-by.r --- r-cran-dplyr-0.7.6/R/group-by.r 2018-06-25 13:29:03.000000000 +0000 +++ r-cran-dplyr-0.7.8/R/group-by.r 2018-11-09 20:55:35.000000000 +0000 @@ -111,7 +111,10 @@ .data <- add_computed_columns(.data, new_groups) # Once we've done the mutate, we need to name all objects - new_groups <- exprs_auto_name(new_groups, printer = tidy_text) + with_options( + lifecycle_disable_verbose_retirement = TRUE, + new_groups <- exprs_auto_name(new_groups, printer = tidy_text) + ) group_names <- names(new_groups) if (add) { diff -Nru r-cran-dplyr-0.7.6/R/grouped-df.r r-cran-dplyr-0.7.8/R/grouped-df.r --- r-cran-dplyr-0.7.6/R/grouped-df.r 2018-06-25 13:29:03.000000000 +0000 +++ r-cran-dplyr-0.7.8/R/grouped-df.r 2018-11-09 20:55:35.000000000 +0000 @@ -75,7 
+75,7 @@ } #' @export -as_data_frame.grouped_df <- function(x, ...) { +as_tibble.grouped_df <- function(x, ...) { x <- ungroup(x) class(x) <- c("tbl_df", "tbl", "data.frame") x @@ -180,7 +180,7 @@ args <- quos(...) named <- named_args(args) - env <- child_env(NULL) + mask <- new_data_mask(new_environment()) n <- length(index) m <- length(args) @@ -192,8 +192,9 @@ out <- set_names(out, names(args)) out <- label_output_list(labels, out, groups(.data)) } else { - env_bind(.env = env, . = group_data, .data = group_data) - out <- overscope_eval_next(env, args[[1]])[0, , drop = FALSE] + env_bind_do_pronouns(mask, group_data) + out <- eval_tidy(args[[1]], mask) + out <- out[0, , drop = FALSE] out <- label_output_dataframe(labels, list(list(out)), groups(.data)) } return(out) @@ -209,10 +210,7 @@ group_data[index[[`_i`]] + 1L, ] <<- value } } - env_bind_fns(.env = env, . = group_slice, .data = group_slice) - - overscope <- new_overscope(env) - on.exit(overscope_clean(overscope)) + env_bind_do_pronouns(mask, group_slice) out <- replicate(m, vector("list", n), simplify = FALSE) names(out) <- names(args) @@ -220,7 +218,7 @@ for (`_i` in seq_len(n)) { for (j in seq_len(m)) { - out[[j]][`_i`] <- list(overscope_eval_next(overscope, args[[j]])) + out[[j]][`_i`] <- list(eval_tidy(args[[j]], mask)) p$tick()$print() } } diff -Nru r-cran-dplyr-0.7.6/R/join-vars.R r-cran-dplyr-0.7.8/R/join-vars.R --- r-cran-dplyr-0.7.6/R/join-vars.R 2018-05-03 08:08:12.000000000 +0000 +++ r-cran-dplyr-0.7.8/R/join-vars.R 2018-11-09 20:55:35.000000000 +0000 @@ -52,7 +52,7 @@ return(x) } - out <- chr_along(x) + out <- rep_along(x, na_chr) for (i in seq_along(x)) { nm <- x[[i]] while (nm %in% y || nm %in% out) { diff -Nru r-cran-dplyr-0.7.6/R/order-by.R r-cran-dplyr-0.7.8/R/order-by.R --- r-cran-dplyr-0.7.6/R/order-by.R 2018-05-03 08:08:12.000000000 +0000 +++ r-cran-dplyr-0.7.8/R/order-by.R 2018-11-09 20:55:35.000000000 +0000 @@ -27,7 +27,7 @@ #' arrange(right, year) order_by <- function(order_by, call) 
{ quo <- enquo(call) - if (!quo_is_lang(quo)) { + if (!quo_is_call(quo)) { type <- friendly_type(type_of(get_expr(quo))) bad_args("call", "must be a function call, not { type }") } diff -Nru r-cran-dplyr-0.7.6/R/RcppExports.R r-cran-dplyr-0.7.8/R/RcppExports.R --- r-cran-dplyr-0.7.6/R/RcppExports.R 2018-06-26 07:09:27.000000000 +0000 +++ r-cran-dplyr-0.7.8/R/RcppExports.R 2018-11-09 20:55:35.000000000 +0000 @@ -90,10 +90,6 @@ .Call(`_dplyr_grouped_df_impl`, data, symbols, drop, build_index) } -as_regular_df <- function(df) { - .Call(`_dplyr_as_regular_df`, df) -} - ungroup_grouped_df <- function(df) { .Call(`_dplyr_ungroup_grouped_df`, df) } diff -Nru r-cran-dplyr-0.7.6/R/rowwise.r r-cran-dplyr-0.7.8/R/rowwise.r --- r-cran-dplyr-0.7.6/R/rowwise.r 2018-06-07 13:28:11.000000000 +0000 +++ r-cran-dplyr-0.7.8/R/rowwise.r 2018-11-09 20:55:35.000000000 +0000 @@ -84,12 +84,9 @@ # Create new environment, inheriting from parent, with an active binding # for . that resolves to the current subset. `_i` is found in environment # of this function because of usual scoping rules. - env <- child_env(NULL) + mask <- new_data_mask(new_environment()) current_row <- function() lapply(group_data[`_i`, , drop = FALSE], "[[", 1) - env_bind_fns(.env = env, . 
= current_row, .data = current_row) - - overscope <- new_overscope(env) - on.exit(overscope_clean(overscope)) + env_bind_do_pronouns(mask, current_row) n <- nrow(.data) m <- length(args) @@ -100,7 +97,7 @@ for (`_i` in seq_len(n)) { for (j in seq_len(m)) { - out[[j]][`_i`] <- list(overscope_eval_next(overscope, args[[j]])) + out[[j]][`_i`] <- list(eval_tidy(args[[j]], mask)) p$tick()$print() } } diff -Nru r-cran-dplyr-0.7.6/R/tbl-cube.r r-cran-dplyr-0.7.8/R/tbl-cube.r --- r-cran-dplyr-0.7.6/R/tbl-cube.r 2018-06-07 13:01:17.000000000 +0000 +++ r-cran-dplyr-0.7.8/R/tbl-cube.r 2018-11-09 20:55:35.000000000 +0000 @@ -187,7 +187,7 @@ #' [tibble::as_data_frame()] resulting data frame contains the #' dimensions as character values (and not as factors). #' @export -as_data_frame.tbl_cube <- function(x, ...) { +as_tibble.tbl_cube <- function(x, ...) { as_data_frame(as.data.frame(x, ..., stringsAsFactors = FALSE)) } diff -Nru r-cran-dplyr-0.7.6/R/tbl-df.r r-cran-dplyr-0.7.8/R/tbl-df.r --- r-cran-dplyr-0.7.6/R/tbl-df.r 2018-06-25 15:41:16.000000000 +0000 +++ r-cran-dplyr-0.7.8/R/tbl-df.r 2018-11-09 20:55:35.000000000 +0000 @@ -6,7 +6,9 @@ #' @keywords internal #' @param data a data frame tbl_df <- function(data) { - as_data_frame(data) + # Works in tibble < 1.5.0 too, because .name_repair will be + # swallowed by the ellipsis + as_tibble(data, .name_repair = "check_unique") } #' @export @@ -27,17 +29,6 @@ as.data.frame(y) } -# Grouping methods ------------------------------------------------------------ - -# These are all inherited from data.frame - see tbl-data-frame.R - -# Standard data frame methods -------------------------------------------------- - -#' @export -as.data.frame.tbl_df <- function(x, row.names = NULL, optional = FALSE, ...) 
{ - as_regular_df(x) -} - # Verbs ------------------------------------------------------------------------ #' @export diff -Nru r-cran-dplyr-0.7.6/R/utils-expr.R r-cran-dplyr-0.7.8/R/utils-expr.R --- r-cran-dplyr-0.7.6/R/utils-expr.R 2018-06-25 13:29:03.000000000 +0000 +++ r-cran-dplyr-0.7.8/R/utils-expr.R 2018-11-09 20:55:35.000000000 +0000 @@ -15,7 +15,7 @@ while (!is_null(node)) { switch_expr(node_car(node), language = node_walk_replace(node_cdar(node), old, new), - symbol = if (identical(node_car(node), old)) mut_node_car(node, new) + symbol = if (identical(node_car(node), old)) node_poke_car(node, new) ) node <- node_cdr(node) } @@ -34,7 +34,7 @@ sym_dollar <- quote(`$`) sym_brackets2 <- quote(`[[`) is_data_pronoun <- function(expr) { - is_lang(expr, list(sym_dollar, sym_brackets2)) && + is_call(expr, list(sym_dollar, sym_brackets2)) && identical(node_cadr(expr), quote(.data)) } tidy_text <- function(quo, width = 60L) { @@ -46,6 +46,7 @@ } } named_quos <- function(...) { + scoped_options(lifecycle_disable_verbose_retirement = TRUE) quos <- quos(...) 
exprs_auto_name(quos, printer = tidy_text) } diff -Nru r-cran-dplyr-0.7.6/R/utils.r r-cran-dplyr-0.7.8/R/utils.r --- r-cran-dplyr-0.7.6/R/utils.r 2018-05-03 08:08:12.000000000 +0000 +++ r-cran-dplyr-0.7.8/R/utils.r 2018-11-09 20:55:35.000000000 +0000 @@ -37,13 +37,13 @@ } deparse_all <- function(x) { - x <- map_if(x, is_quosure, quo_expr) + x <- map_if(x, is_quosure, quo_squash) x <- map_if(x, is_formula, f_rhs) map_chr(x, expr_text, width = 500L) } deparse_names <- function(x) { - x <- map_if(x, is_quosure, quo_expr) + x <- map_if(x, is_quosure, quo_squash) x <- map_if(x, is_formula, f_rhs) map_chr(x, deparse) } @@ -91,7 +91,7 @@ } is_negated <- function(x) { - is_lang(x, "-", n = 1) + is_call(x, "-", n = 1) } inc_seq <- function(from, to) { diff -Nru r-cran-dplyr-0.7.6/R/zzz.r r-cran-dplyr-0.7.8/R/zzz.r --- r-cran-dplyr-0.7.6/R/zzz.r 2018-05-03 08:08:12.000000000 +0000 +++ r-cran-dplyr-0.7.8/R/zzz.r 2018-11-09 20:55:35.000000000 +0000 @@ -6,6 +6,16 @@ toset <- !(names(op.dplyr) %in% names(op)) if (any(toset)) options(op.dplyr[toset]) + local(envir = ns_env("dplyr"), { + delayedAssign("env_bind_active", { + if (utils::packageVersion("rlang") < "0.2.99") { + env_get(ns_env("rlang"), "env_bind_fns") + } else { + env_get(ns_env("rlang"), "env_bind_active") + } + }) + }) + invisible() } diff -Nru r-cran-dplyr-0.7.6/src/group_by.cpp r-cran-dplyr-0.7.8/src/group_by.cpp --- r-cran-dplyr-0.7.6/src/group_by.cpp 2018-06-27 18:50:04.000000000 +0000 +++ r-cran-dplyr-0.7.8/src/group_by.cpp 2018-11-09 21:23:40.000000000 +0000 @@ -20,7 +20,7 @@ if (!symbols.size()) stop("no variables to group by"); if (build_index) { - build_index_cpp(copy); + build_index_cpp_by_ref(copy); } else { strip_index(copy); @@ -28,15 +28,6 @@ return copy; } -// [[Rcpp::export]] -DataFrame as_regular_df(DataFrame df) { - DataFrame copy(shallow_copy(df)); - SET_ATTRIB(copy, strip_group_attributes(df)); - SET_OBJECT(copy, OBJECT(df)); - set_class(copy, CharacterVector::create("data.frame")); - return 
copy; -} - // [[Rcpp::export]] DataFrame ungroup_grouped_df(DataFrame df) { DataFrame copy(shallow_copy(df)); diff -Nru r-cran-dplyr-0.7.6/src/group_indices.cpp r-cran-dplyr-0.7.8/src/group_indices.cpp --- r-cran-dplyr-0.7.6/src/group_indices.cpp 2018-06-27 18:50:04.000000000 +0000 +++ r-cran-dplyr-0.7.8/src/group_indices.cpp 2018-11-09 21:23:40.000000000 +0000 @@ -39,9 +39,15 @@ return Count().process(gdf); } +// Still need this for valr. +DataFrame build_index_cpp(DataFrame data) { + build_index_cpp_by_ref(data); + return data; +} + // Updates attributes in data by reference! // All these attributes are private to dplyr. -void build_index_cpp(DataFrame& data) { +void build_index_cpp_by_ref(DataFrame& data) { SymbolVector vars(get_vars(data)); const int nvars = vars.size(); diff -Nru r-cran-dplyr-0.7.6/src/hybrid.cpp r-cran-dplyr-0.7.8/src/hybrid.cpp --- r-cran-dplyr-0.7.6/src/hybrid.cpp 2018-06-27 18:50:04.000000000 +0000 +++ r-cran-dplyr-0.7.8/src/hybrid.cpp 2018-11-09 21:23:40.000000000 +0000 @@ -283,7 +283,7 @@ SEXP data; try { data = env.find(sym.get_string()); - } catch (Rcpp::binding_not_found) { + } catch (const Rcpp::binding_not_found&) { return NULL; } diff -Nru r-cran-dplyr-0.7.6/src/init.cpp r-cran-dplyr-0.7.8/src/init.cpp --- r-cran-dplyr-0.7.6/src/init.cpp 2018-06-27 18:50:04.000000000 +0000 +++ r-cran-dplyr-0.7.8/src/init.cpp 2018-11-09 21:23:40.000000000 +0000 @@ -37,4 +37,4 @@ // work around a problem (?) 
in Rcpp // [[Rcpp::interfaces(cpp)]] // [[Rcpp::export]] -void build_index_cpp(DataFrame& data); +DataFrame build_index_cpp(DataFrame data); diff -Nru r-cran-dplyr-0.7.6/src/RcppExports.cpp r-cran-dplyr-0.7.8/src/RcppExports.cpp --- r-cran-dplyr-0.7.6/src/RcppExports.cpp 2018-06-27 18:50:04.000000000 +0000 +++ r-cran-dplyr-0.7.8/src/RcppExports.cpp 2018-11-09 21:23:40.000000000 +0000 @@ -207,17 +207,6 @@ return rcpp_result_gen; END_RCPP } -// as_regular_df -DataFrame as_regular_df(DataFrame df); -RcppExport SEXP _dplyr_as_regular_df(SEXP dfSEXP) { -BEGIN_RCPP - Rcpp::RObject rcpp_result_gen; - Rcpp::RNGScope rcpp_rngScope_gen; - Rcpp::traits::input_parameter< DataFrame >::type df(dfSEXP); - rcpp_result_gen = Rcpp::wrap(as_regular_df(df)); - return rcpp_result_gen; -END_RCPP -} // ungroup_grouped_df DataFrame ungroup_grouped_df(DataFrame df); RcppExport SEXP _dplyr_ungroup_grouped_df(SEXP dfSEXP) { @@ -282,6 +271,10 @@ UNPROTECT(1); Rf_onintr(); } + bool rcpp_isLongjump_gen = Rcpp::internal::isLongjumpSentinel(rcpp_result_gen); + if (rcpp_isLongjump_gen) { + Rcpp::internal::resumeJump(rcpp_result_gen); + } Rboolean rcpp_isError_gen = Rf_inherits(rcpp_result_gen, "try-error"); if (rcpp_isError_gen) { SEXP rcpp_msgSEXP_gen = Rf_asChar(rcpp_result_gen); @@ -311,6 +304,10 @@ UNPROTECT(1); Rf_onintr(); } + bool rcpp_isLongjump_gen = Rcpp::internal::isLongjumpSentinel(rcpp_result_gen); + if (rcpp_isLongjump_gen) { + Rcpp::internal::resumeJump(rcpp_result_gen); + } Rboolean rcpp_isError_gen = Rf_inherits(rcpp_result_gen, "try-error"); if (rcpp_isError_gen) { SEXP rcpp_msgSEXP_gen = Rf_asChar(rcpp_result_gen); @@ -321,12 +318,13 @@ return rcpp_result_gen; } // build_index_cpp -void build_index_cpp(DataFrame& data); +DataFrame build_index_cpp(DataFrame data); static SEXP _dplyr_build_index_cpp_try(SEXP dataSEXP) { BEGIN_RCPP - Rcpp::traits::input_parameter< DataFrame& >::type data(dataSEXP); - build_index_cpp(data); - return R_NilValue; + Rcpp::RObject rcpp_result_gen; + 
Rcpp::traits::input_parameter< DataFrame >::type data(dataSEXP); + rcpp_result_gen = Rcpp::wrap(build_index_cpp(data)); + return rcpp_result_gen; END_RCPP_RETURN_ERROR } RcppExport SEXP _dplyr_build_index_cpp(SEXP dataSEXP) { @@ -340,6 +338,10 @@ UNPROTECT(1); Rf_onintr(); } + bool rcpp_isLongjump_gen = Rcpp::internal::isLongjumpSentinel(rcpp_result_gen); + if (rcpp_isLongjump_gen) { + Rcpp::internal::resumeJump(rcpp_result_gen); + } Rboolean rcpp_isError_gen = Rf_inherits(rcpp_result_gen, "try-error"); if (rcpp_isError_gen) { SEXP rcpp_msgSEXP_gen = Rf_asChar(rcpp_result_gen); @@ -675,7 +677,7 @@ if (signatures.empty()) { signatures.insert("SEXP(*get_date_classes)()"); signatures.insert("SEXP(*get_time_classes)()"); - signatures.insert("void(*build_index_cpp)(DataFrame&)"); + signatures.insert("DataFrame(*build_index_cpp)(DataFrame)"); } return signatures.find(sig) != signatures.end(); } @@ -707,7 +709,6 @@ {"_dplyr_n_distinct_multi", (DL_FUNC) &_dplyr_n_distinct_multi, 2}, {"_dplyr_filter_impl", (DL_FUNC) &_dplyr_filter_impl, 2}, {"_dplyr_grouped_df_impl", (DL_FUNC) &_dplyr_grouped_df_impl, 4}, - {"_dplyr_as_regular_df", (DL_FUNC) &_dplyr_as_regular_df, 1}, {"_dplyr_ungroup_grouped_df", (DL_FUNC) &_dplyr_ungroup_grouped_df, 1}, {"_dplyr_test_grouped_df", (DL_FUNC) &_dplyr_test_grouped_df, 1}, {"_dplyr_grouped_indices_grouped_df_impl", (DL_FUNC) &_dplyr_grouped_indices_grouped_df_impl, 1}, diff -Nru r-cran-dplyr-0.7.6/tests/testthat/test-binds.R r-cran-dplyr-0.7.8/tests/testthat/test-binds.R --- r-cran-dplyr-0.7.6/tests/testthat/test-binds.R 2018-06-25 13:29:03.000000000 +0000 +++ r-cran-dplyr-0.7.8/tests/testthat/test-binds.R 2018-11-09 20:55:35.000000000 +0000 @@ -117,13 +117,13 @@ }) test_that("bind_rows only accepts data frames or named vectors", { - ll <- list(1:5, rlang::get_env()) + ll <- list(1:5, env(a = 1)) expect_error( bind_rows(ll), "Argument 1 must have names", fixed = TRUE ) - ll <- list(tibble(a = 1:5), rlang::get_env()) + ll <- list(tibble(a = 
1:5), env(a = 1)) expect_error( bind_rows(ll), "Argument 2 must be a data frame or a named atomic vector, not a environment", diff -Nru r-cran-dplyr-0.7.6/tests/testthat/test-colwise-mutate.R r-cran-dplyr-0.7.8/tests/testthat/test-colwise-mutate.R --- r-cran-dplyr-0.7.6/tests/testthat/test-colwise-mutate.R 2018-06-25 13:29:03.000000000 +0000 +++ r-cran-dplyr-0.7.8/tests/testthat/test-colwise-mutate.R 2018-11-09 20:55:35.000000000 +0000 @@ -74,7 +74,7 @@ }) test_that("empty selection does not select everything (#2009, #1989)", { - expect_equal(mtcars, mutate_if(mtcars, is.factor, as.character)) + expect_equal(mutate(mtcars), mutate_if(mtcars, is.factor, as.character)) }) test_that("error is thrown with improper additional arguments", { diff -Nru r-cran-dplyr-0.7.6/tests/testthat/test-colwise.R r-cran-dplyr-0.7.8/tests/testthat/test-colwise.R --- r-cran-dplyr-0.7.6/tests/testthat/test-colwise.R 2018-05-03 08:08:12.000000000 +0000 +++ r-cran-dplyr-0.7.8/tests/testthat/test-colwise.R 2018-11-09 20:55:35.000000000 +0000 @@ -10,7 +10,10 @@ test_that("tbl_at_vars() treats `NULL` as empty inputs", { expect_identical(tbl_at_vars(mtcars, vars(NULL)), tbl_at_vars(mtcars, vars())) - expect_identical(mutate_at(mtcars, vars(NULL), `*`, 100), mtcars) + expect_identical( + tibble::remove_rownames(mutate_at(mtcars, vars(NULL), `*`, 100)), + tibble::remove_rownames(mtcars) + ) }) test_that("tbl_if_vars() errs on bad input", { diff -Nru r-cran-dplyr-0.7.6/tests/testthat/test-do.R r-cran-dplyr-0.7.8/tests/testthat/test-do.R --- r-cran-dplyr-0.7.6/tests/testthat/test-do.R 2018-05-03 08:10:51.000000000 +0000 +++ r-cran-dplyr-0.7.8/tests/testthat/test-do.R 2018-11-09 20:55:35.000000000 +0000 @@ -116,6 +116,15 @@ expect_equal(f(100)$a, list(100)) }) +# Rowwise data frames ---------------------------------------------------------- + +test_that("can do on rowwise dataframe", { + out <- mtcars %>% rowwise() %>% do(x = 1) + exp <- tibble(x =rep(list(1), nrow(mtcars))) %>% rowwise() + 
expect_identical(out, exp) +}) + + # Zero row inputs -------------------------------------------------------------- test_that("empty data frames give consistent outputs", { diff -Nru r-cran-dplyr-0.7.6/tests/testthat/test-funs.R r-cran-dplyr-0.7.8/tests/testthat/test-funs.R --- r-cran-dplyr-0.7.6/tests/testthat/test-funs.R 2018-05-03 08:08:12.000000000 +0000 +++ r-cran-dplyr-0.7.8/tests/testthat/test-funs.R 2018-11-09 20:55:35.000000000 +0000 @@ -17,7 +17,7 @@ test_that("funs() accepts unquoted functions", { funs <- funs(fn = !!mean) - expect_identical(funs$fn, new_quosure(lang(base::mean, quote(.)))) + expect_identical(funs$fn, new_quosure(call2(base::mean, quote(.)))) }) test_that("funs() accepts quoted calls", { diff -Nru r-cran-dplyr-0.7.6/tests/testthat/test-hybrid.R r-cran-dplyr-0.7.8/tests/testthat/test-hybrid.R --- r-cran-dplyr-0.7.6/tests/testthat/test-hybrid.R 2018-06-25 13:29:03.000000000 +0000 +++ r-cran-dplyr-0.7.8/tests/testthat/test-hybrid.R 2018-11-09 20:55:35.000000000 +0000 @@ -20,22 +20,22 @@ list(.data) }) - expect_true(env_has(df$f[[1]], "a", inherit = TRUE)) - expect_true(env_has(df$g[[1]], "f", inherit = TRUE)) - expect_true(env_has(df$h[[1]], "g", inherit = TRUE)) + expect_true(env_has(get_env(df$f[[1]]), "a", inherit = TRUE)) + expect_true(env_has(get_env(df$g[[1]]), "f", inherit = TRUE)) + expect_true(env_has(get_env(df$h[[1]]), "g", inherit = TRUE)) expect_warning( - expect_null(env_get(df$f[[1]], "a", inherit = TRUE)), + expect_null(env_get(get_env(df$f[[1]]), "a", inherit = TRUE)), "Hybrid callback proxy out of scope", fixed = TRUE ) expect_warning( - expect_null(env_get(df$g[[1]], "f", inherit = TRUE)), + expect_null(env_get(get_env(df$g[[1]]), "f", inherit = TRUE)), "Hybrid callback proxy out of scope", fixed = TRUE ) expect_warning( - expect_null(env_get(df$h[[1]], "g", inherit = TRUE)), + expect_null(env_get(get_env(df$h[[1]]), "g", inherit = TRUE)), "Hybrid callback proxy out of scope", fixed = TRUE ) diff -Nru 
r-cran-dplyr-0.7.6/tests/testthat/test-joins.r r-cran-dplyr-0.7.8/tests/testthat/test-joins.r --- r-cran-dplyr-0.7.6/tests/testthat/test-joins.r 2018-06-25 13:29:03.000000000 +0000 +++ r-cran-dplyr-0.7.8/tests/testthat/test-joins.r 2018-11-09 20:55:35.000000000 +0000 @@ -1007,7 +1007,7 @@ expect_error( left_join(df1, df2, by = c("x", "y")), - "Column `x` must have a unique name", + "name", fixed = TRUE ) @@ -1019,7 +1019,7 @@ expect_error( right_join(df1, df2, by = c("x", "y")), - "Column `x` must have a unique name", + "name", fixed = TRUE ) @@ -1031,7 +1031,7 @@ expect_error( inner_join(df1, df2, by = c("x", "y")), - "Column `x` must have a unique name", + "name", fixed = TRUE ) @@ -1043,7 +1043,7 @@ expect_error( full_join(df1, df2, by = c("x", "y")), - "Column `x` must have a unique name", + "name", fixed = TRUE ) @@ -1055,7 +1055,7 @@ expect_error( semi_join(df1, df2, by = c("x", "y")), - "Column `x` must have a unique name", + "name", fixed = TRUE ) @@ -1071,7 +1071,7 @@ expect_error( anti_join(df1, df2, by = c("x", "y")), - "Column `x` must have a unique name", + "name", fixed = TRUE ) @@ -1091,9 +1091,9 @@ df_b <- tibble::tibble(AA = 2:4, C = c("aa", "bb", "cc")) df_aa <- df_a - names(df_aa) <- c(NA, "AA") + attr(df_aa, "names") <- c(NA, "AA") df_ba <- df_b - names(df_ba) <- c("AA", NA) + attr(df_ba, "names") <- c("AA", NA) expect_error( left_join(df_aa, df_b), diff -Nru r-cran-dplyr-0.7.6/tests/testthat/test-mutate.r r-cran-dplyr-0.7.8/tests/testthat/test-mutate.r --- r-cran-dplyr-0.7.6/tests/testthat/test-mutate.r 2018-06-25 13:29:03.000000000 +0000 +++ r-cran-dplyr-0.7.8/tests/testthat/test-mutate.r 2018-11-09 20:55:35.000000000 +0000 @@ -724,14 +724,14 @@ expect_identical(mutate(df, out = !!(1:5)), mutate(df, out = 1:5)) expect_identical(mutate(df, out = !!quote(1:5)), mutate(df, out = 1:5)) expect_error(mutate(df, out = !!(1:2)), "must be length 5 (the number of rows)", fixed = TRUE) - expect_error(mutate(df, out = !!get_env()), "unsupported type") + 
expect_error(mutate(df, out = !!env(a = 1)), "unsupported type") gdf <- group_by(df, g) expect_identical(mutate(gdf, out = !!1), mutate(gdf, out = 1)) expect_identical(mutate(gdf, out = !!(1:5)), group_by(mutate(df, out = 1:5), g)) expect_error(mutate(gdf, out = !!quote(1:5)), "must be length 2 (the group size)", fixed = TRUE) expect_error(mutate(gdf, out = !!(1:2)), "must be length 5 (the number of rows)", fixed = TRUE) - expect_error(mutate(gdf, out = !!get_env()), "unsupported type") + expect_error(mutate(gdf, out = !!env(a = 1)), "unsupported type") }) test_that("gathering handles promotion from raw", { diff -Nru r-cran-dplyr-0.7.6/tests/testthat/test-sets.R r-cran-dplyr-0.7.8/tests/testthat/test-sets.R --- r-cran-dplyr-0.7.6/tests/testthat/test-sets.R 2018-06-25 21:55:26.000000000 +0000 +++ r-cran-dplyr-0.7.8/tests/testthat/test-sets.R 2018-11-09 20:55:35.000000000 +0000 @@ -77,3 +77,13 @@ expect_equal(intersect(df1, df2), filter(df1, x >= 3)) expect_equal(union(df1, df2), tibble(x = 1:6, g = rep(1:3, each = 2)) %>% group_by(g)) }) + +test_that("set equality", { + df1 <- tibble(x = 1:4, g = rep(1:2, each = 2)) %>% group_by(g) + df2 <- tibble(x = 3:6, g = rep(2:3, each = 2)) + + expect_true(setequal(df1, df1)) + expect_true(setequal(df2, df2)) + expect_false(setequal(df1, df2)) + expect_false(setequal(df2, df1)) +}) diff -Nru r-cran-dplyr-0.7.6/tests/testthat/test-summarise.r r-cran-dplyr-0.7.8/tests/testthat/test-summarise.r --- r-cran-dplyr-0.7.6/tests/testthat/test-summarise.r 2018-06-25 13:29:03.000000000 +0000 +++ r-cran-dplyr-0.7.8/tests/testthat/test-summarise.r 2018-11-09 20:55:35.000000000 +0000 @@ -964,7 +964,7 @@ }) test_that("ungrouped summarise() uses summary variables correctly (#2404)", { - df <- tibble::as_tibble(seq(1:10)) + df <- tibble(value = seq(1:10)) out <- df %>% summarise(value = mean(value), sd = sd(value)) expect_equal(out$value, 5.5) diff -Nru r-cran-dplyr-0.7.6/tests/testthat/test-utils.R 
r-cran-dplyr-0.7.8/tests/testthat/test-utils.R --- r-cran-dplyr-0.7.6/tests/testthat/test-utils.R 2018-06-25 13:29:03.000000000 +0000 +++ r-cran-dplyr-0.7.8/tests/testthat/test-utils.R 2018-11-09 20:55:35.000000000 +0000 @@ -14,6 +14,6 @@ test_that("get_vars() handles list of symbols as vars attribute", { gdf <- group_by(tibble(g = 1:2), g) - gdf <- set_attrs(gdf, vars = list(sym("g"))) + gdf <- structure(gdf, vars = list(sym("g"))) expect_identical(test_grouped_df(gdf), gdf) }) diff -Nru r-cran-dplyr-0.7.6/vignettes/programming.Rmd r-cran-dplyr-0.7.8/vignettes/programming.Rmd --- r-cran-dplyr-0.7.6/vignettes/programming.Rmd 2018-05-03 08:08:12.000000000 +0000 +++ r-cran-dplyr-0.7.8/vignettes/programming.Rmd 2018-11-09 20:55:35.000000000 +0000 @@ -278,7 +278,7 @@ ```{r} my_summarise <- function(df, group_var) { df %>% - group_by(!! group_var) %>% + group_by(!!group_var) %>% summarise(a = mean(a)) } @@ -304,7 +304,7 @@ print(quo_group_var) df %>% - group_by(!! quo_group_var) %>% + group_by(!!quo_group_var) %>% summarise(a = mean(a)) } @@ -327,7 +327,7 @@ print(group_var) df %>% - group_by(!! group_var) %>% + group_by(!!group_var) %>% summarise(a = mean(a)) } @@ -359,7 +359,7 @@ ```{r} my_var <- quo(a) -summarise(df, mean = mean(!! my_var), sum = sum(!! my_var), n = n()) +summarise(df, mean = mean(!!my_var), sum = sum(!!my_var), n = n()) ``` You can also wrap `quo()` around the dplyr call to see what will happen from @@ -367,8 +367,8 @@ ```{r} quo(summarise(df, - mean = mean(!! my_var), - sum = sum(!! my_var), + mean = mean(!!my_var), + sum = sum(!!my_var), n = n() )) ``` @@ -381,8 +381,8 @@ expr <- enquo(expr) summarise(df, - mean = mean(!! expr), - sum = sum(!! expr), + mean = mean(!!expr), + sum = sum(!!expr), n = n() ) } @@ -405,7 +405,7 @@ * We create the new names by pasting together strings, so we need `quo_name()` to convert the input expression to a string. -* `!! mean_name = mean(!! 
expr)` isn't valid R code, so we need to +* `!!mean_name = mean(!!expr)` isn't valid R code, so we need to use the `:=` helper provided by rlang. ```{r} @@ -415,8 +415,8 @@ sum_name <- paste0("sum_", quo_name(expr)) mutate(df, - !! mean_name := mean(!! expr), - !! sum_name := sum(!! expr) + !!mean_name := mean(!!expr), + !!sum_name := sum(!!expr) ) } @@ -441,7 +441,7 @@ group_var <- quos(...) df %>% - group_by(!!! group_var) %>% + group_by(!!!group_var) %>% summarise(a = mean(a)) } @@ -453,10 +453,10 @@ ```{r} args <- list(na.rm = TRUE, trim = 0.25) -quo(mean(x, !!! args)) +quo(mean(x, !!!args)) args <- list(quo(x), na.rm = TRUE, trim = 0.25) -quo(mean(!!! args)) +quo(mean(!!!args)) ``` Now that you've learned the basics of tidyeval through some practical examples, @@ -589,16 +589,14 @@ ### Unquoting -The first important operation is the basic unquote, which comes in a functional -form, `UQ()`, and as syntactic-sugar, `!!`. +The first important operation is the basic unquote, `!!`. ```{r} # Here we capture `letters[1:5]` as an expression: quo(toupper(letters[1:5])) # Here we capture the value of `letters[1:5]` -quo(toupper(!! letters[1:5])) -quo(toupper(UQ(letters[1:5]))) +quo(toupper(!!letters[1:5])) ``` It is also possible to unquote other quoted expressions. Unquoting such @@ -606,7 +604,7 @@ ```{r} var1 <- quo(letters[1:5]) -quo(toupper(!! var1)) +quo(toupper(!!var1)) ``` You can safely unquote quosures because they track their environments, and @@ -618,7 +616,7 @@ mtcars %>% select(cyl) %>% slice(1:4) %>% - mutate(cyl2 = cyl + (!! x)) + mutate(cyl2 = cyl + (!!x)) } f <- function(x) quo(x) @@ -629,41 +627,14 @@ my_mutate(expr2) ``` -The functional form is useful in cases where the precedence of `!` causes -problems: - -```{r, error = TRUE} -my_fun <- quo(fun) -quo(!! my_fun(x, y, z)) -quo(UQ(my_fun)(x, y, z)) - -my_var <- quo(x) -quo(filter(df, !! 
my_var == 1)) -quo(filter(df, UQ(my_var) == 1)) -``` - -You'll note above that `UQ()` yields a quosure containing a formula. That -ensures that when the quosure is evaluated, it'll be looked up in the right -environment. In certain code-generation scenarios you just want to use -expression and ignore the environment. That's the job of `UQE()`: - -```{r} -quo(UQE(my_fun)(x, y, z)) -quo(filter(df, UQE(my_var) == 1)) -``` - -`UQE()` is for expert use only as you'll have to carefully analyse the -environments to ensure that the generated code is correct. - ### Unquote-splicing -The second unquote operation is unquote-splicing. Its functional form is `UQS()` -and the syntactic shortcut is `!!!`. It takes a vector and inserts each element +The second unquote operation is unquote-splicing, `!!!`. It takes a vector and inserts each element of the vector in the surrounding function call: ```{r} -quo(list(!!! letters[1:5])) +quo(list(!!!letters[1:5])) ``` A very useful feature of unquote-splicing is that the vector names @@ -671,7 +642,7 @@ ```{r} x <- list(foo = 1L, bar = quo(baz)) -quo(list(!!! x)) +quo(list(!!!x)) ``` This makes it easy to program with dplyr verbs that take named dots: @@ -680,7 +651,7 @@ args <- list(mean = quo(mean(cyl)), count = quo(n())) mtcars %>% group_by(am) %>% - summarise(!!! args) + summarise(!!!args) ``` @@ -700,7 +671,7 @@ mtcars %>% group_by(am) %>% summarise( - !! mean_nm := mean(cyl), - !! count_nm := n() + !!mean_nm := mean(cyl), + !!count_nm := n() ) ```