The effect of filtration method on the efficiency of environmental DNA capture and quantification via metabarcoding

Environmental DNA (eDNA) is a promising tool for rapid and noninvasive biodiversity monitoring. eDNA density is low in environmental samples, and a capture method, such as filtration, is often required to concentrate eDNA for downstream analyses. In this study, six treatments, with differing filter types and pore sizes for eDNA capture, were compared for their efficiency and accuracy in assessing fish community structure with known fish abundance and biomass via eDNA metabarcoding. Our results showed that different filters (with the exception of 20‐μm large‐pore filters) were broadly consistent in their DNA capture ability. The 0.45‐μm filters performed the best in terms of total DNA yield, probability of species detection, repeatability within pond and consistency between ponds. However, the performance of 0.45‐μm filters was only marginally better than that of 0.8‐μm filters, while filtration time was significantly longer. Given this trade‐off, 0.8 μm is the optimal pore size of membrane filter for the turbid, eutrophic and high fish density ponds analysed here. The 0.45‐μm Sterivex enclosed filters performed reasonably well and are suitable in situations where on‐site filtration is required. Finally, we recommend that prefilters are applied only if absolutely essential for reducing the filtration time or increasing the throughput volume of the capture filters. In summary, we found encouraging similarity in the results obtained from different filtration methods, but the optimal pore size of filter or filter type might strongly depend on the water type under study.

sizes) and approaches (e.g., on-site or in laboratory) to filtration. On-site filtration followed by immediate preservation theoretically enhances DNA integrity and is critical for some remote field surveys where access to laboratory facilities is not available. Enclosed filters such as Sterivex units (Millipore) or Nalgene analytical test filter funnels (Thermo Fisher Scientific), in combination with a portable peristaltic or hand-driven pump, are popular protocols for the capture of eDNA in the field (Bergman, Schumer, Blankenship, & Campbell, 2016; Keskin, 2014; Spens et al., 2017; Wilcox et al., 2016). However, a larger number of water samples can be filtered simultaneously in a laboratory setting, which reduces the processing time. Four main types of membrane filter (so-called open filters) are commonly used in the laboratory set-ups of freshwater studies: (a) 0.45-μm cellulose nitrate (CN) filters (e.g., Goldberg, Pilliod, Arkle, & Waits, 2011; Pilliod, Goldberg, Arkle, & Waits, 2013), (b) 0.45-μm nylon filters (e.g., Thomsen et al., 2012), (c) 0.7- or 1.5-μm glass fibre (GF) filters (e.g., Miya et al., 2015; Wilcox et al., 2013) and (d) 1.2-μm polycarbonate (PC) filters (e.g., Egan et al., 2015).
The suitability of various pore sizes of filter to capture eDNA may be heavily influenced by the heterogeneous nature of aquatic ecosystems. Suspended particulate matter (SPM, e.g., organic matter and sediment) can quickly block 0.2- or 0.45-μm filters (Minamoto, Naka, Moji, & Maruyama, 2016; Shaw, Weyrich, & Cooper, 2016), which will severely prolong filtration time and potentially increase the concentration of PCR inhibitors (McKee, Spear, & Pierson, 2015; Tsai & Olson, 1992). For highly turbid water such as ponds or tropical freshwater ecosystems, even 3-μm PC filters are easily blocked (Robson et al., 2016). Most previous studies that have investigated the impact of different types and pore sizes of filter on DNA quantity have focussed on individual target species using real-time quantitative PCR (qPCR) (e.g., Eichmiller et al., 2016; Lacoursière-Roussel, Rosabal, & Bernatchez, 2016; Minamoto et al., 2016; Robson et al., 2016).
Recently, eDNA-based metabarcoding using high-throughput sequencing has emerged as a powerful tool to monitor entire aquatic communities (e.g., Deiner, Fronhofer, Mächler, Walser, & Altermatt, 2016; Hänfling et al., 2016; Port et al., 2016; Valentini et al., 2016). To our knowledge, few previous studies have investigated whether and how the choice of filtration method affects estimates of fish community composition. The preliminary results of Miya et al. (2016) showed that the number of detected fish species was significantly higher when using enclosed 0.45-μm polyvinylidene difluoride (PVDF) filters compared with 0.7-μm GF filters, although different filtration systems and extraction methods were used in each case. Djurhuus et al. (2017) found that different filter membrane materials (0.2-μm PC, CN, polyethersulfone "PES" and PVDF) and extraction methods did not affect estimates of species richness and community composition across multiple trophic levels. Majaneva et al. (2018) indicated that 0.45-μm MCE filters (described as CN filters in the study) represented the community composition of metazoans more consistently than 0.2-μm PES filters, while the effect of using 12-μm filters as prefilters remained ambiguous.
The aim of this study was to further investigate the impact of different filters on eDNA capture and community diversity estimation through eDNA metabarcoding. Specifically, we compared different pore sizes of membrane filter, different types of filter ("open filters" and "enclosed filters") and the impact of prefiltration. We evaluated the effect on filtration time, total eDNA recovered, probability of species detection, repeatability and the relationship between read counts and known fish abundance or biomass in four fish ponds with differing assemblages.

| Study site and water sampling
This study was carried out at four artificial stock ponds (E1-E4) at the National Coarse Fish Rearing Unit (Nottingham, UK), run by the UK Environment Agency. The size of each pond is 5,100 m² (60 m × 85 m), and the depth is 1-1.5 m. Generally, these ponds are used to rear approximately 1-year-old common British coarse fish from June to January before they are used in stocking programmes for conservation purposes or recreational fishing. All fish were measured and weighed before stocking in the ponds on 15th June 2015 and after harvesting on 18th January 2016. Fish abundance and biomass at the time of water sampling in August 2015 were estimated, assuming that death and growth curves of these fish are linear (Supporting Information Figures S1 and S2). The fish stock information in August 2015 is shown in Table 1.
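The linear assumption used to estimate the August stock can be sketched as a simple interpolation by date. This is only an illustrative sketch: the function name and the example numbers are hypothetical and are not taken from Table 1, and the sampling date is the 6th August 2015 date reported for water sampling.

```python
from datetime import date

STOCKING = date(2015, 6, 15)   # fish measured and weighed before stocking
HARVEST = date(2016, 1, 18)    # fish measured and weighed after harvesting
SAMPLING = date(2015, 8, 6)    # water sampling date


def interpolate_linear(value_at_stocking, value_at_harvest, when=SAMPLING):
    """Estimate abundance or biomass on `when`, assuming a linear change
    between the stocking and harvest measurements."""
    total_days = (HARVEST - STOCKING).days
    elapsed_days = (when - STOCKING).days
    fraction = elapsed_days / total_days
    return value_at_stocking + fraction * (value_at_harvest - value_at_stocking)


# Hypothetical example: 1000 fish stocked, 783 harvested.
# 52 of the 217 days between stocking and harvest had elapsed at sampling.
abundance_at_sampling = interpolate_linear(1000, 783)
```

The same function applies to biomass by passing the total weights at stocking and harvest instead of counts.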
Water sampling was carried out on 6th August 2015. The dissolved oxygen (DO) concentration was similar between ponds (mean ± SD, 7.9 ± 0.8 mg/L). For each pond, 12 water samples were collected at evenly distributed points around the shore. A 1-L sterile bottle was used to collect water at each point just below the surface, and then the water was pooled into a 12.5-L sterile water container. After inverting and shaking the collection container, the water was then subsampled with 25 Gosselin 500-ml sterile plastic bottles. All samples were stored in cool boxes, transferred to the eDNA laboratory at University of Hull (UoH) within 2 hrs and refrigerated until filtration.

| eDNA capture treatments
Six filtration-based eDNA capture treatments were used for each pond. These treatments were as follows: (a) "0.45MCE": 0.45-μm mixed cellulose acetate and nitrate (also known as mixed cellulose ester or "MCE") filters, 47 mm diameter (Whatman); (b) "0.8MCE": 0.8-μm MCE filters, 47 mm diameter (Whatman); (c) "1.2MCE": 1.2-μm MCE filters, 50 mm diameter (Whatman); (d) "0.45Sterivex": 0.45-μm Sterivex-HV PVDF units (Millipore); (e) "PF_0.45MCE": 0.45-μm MCE filters, 47 mm diameter (Whatman) after prefiltration with 20-μm qualitative cellulose filters, Grade 4 (Whatman); and (f) "PF": the prefilters used in treatment (e). Each treatment was replicated five times in each pond, filtering 300 ml water each time, resulting in a total of 120 replicates. These treatments were used to measure three
To reduce cross-contamination, the samples from individual ponds were filtered separately in order of pond E1-E4. For each replicate (apart from the "0.45Sterivex" treatment), 300 ml water was filtered using Nalgene filtration units (Thermo Fisher Scientific) in combination with a vacuum pump (15-20 in. Hg; Pall Corporation). For each pond, the same filtration unit was used for all five replicates of the same capture treatment. The filtration units were cleaned with 10% v/v commercial bleach solution and 5% v/v microsol detergent (Anachem, UK), and then rinsed thoroughly with deionized water after each filtration to prevent cross-contamination.
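The total of 120 replicates follows from crossing four ponds, six treatments and five replicates. A minimal sketch of that design is given below; the treatment labels are those defined above, while the sample ID format is a hypothetical naming scheme used only for illustration.

```python
from itertools import product

PONDS = ["E1", "E2", "E3", "E4"]
TREATMENTS = ["0.45MCE", "0.8MCE", "1.2MCE", "0.45Sterivex", "PF_0.45MCE", "PF"]
REPLICATES = range(1, 6)  # five filtration replicates per treatment and pond

# Hypothetical sample IDs of the form "<pond>_<treatment>_r<replicate>"
samples = [
    f"{pond}_{treatment}_r{rep}"
    for pond, treatment, rep in product(PONDS, TREATMENTS, REPLICATES)
]

n_samples = len(samples)  # 4 ponds x 6 treatments x 5 replicates
```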
Filtration blanks (n = 5) with 300 ml deionized water were run before the first filtration and after every wash run in order to test for possible contamination at the filtration stage. For the "0.45Sterivex" treatment, 300 ml water was directly filtered with 0.45-μm Sterivex units in combination with a vacuum pump (15-20 in. Hg; Pall Corporation). All samples were filtered within 24 hrs of collection in a dedicated eDNA filtration laboratory at UoH.
After filtration, all membrane filters were placed into 50-mm sterile Petri dishes sealed with parafilm, while Sterivex units were closed with inlet and outlet caps. All samples were stored in a freezer at −20°C until DNA extraction. DNA extraction was carried out using the PowerWater (Sterivex) DNA Isolation Kits (MoBio Laboratories Inc., now Qiagen) following the manufacturer's protocol. Total DNA concentration was quantified using a NanoDrop ND-1000 Spectrophotometer (Thermo Fisher Scientific) after extraction.

| Library preparation and sequencing
Extracted DNA samples were PCR-amplified targeting a 106-bp vertebrate-specific fragment of the mitochondrial 12S rRNA region (Riaz et al., 2011) following a one-step library preparation protocol (Kozich, Westcott, Baxter, Highlander, & Schloss, 2013) with amplification primers that include PCR primers, indices and flow cell adapters. Previous studies showed that this fragment has a low false negative rate in both marine mesocosm and coastal ecosystem eDNA metabarcoding studies of bony fishes (Kelly, Port, Yamahara, & Crowder, 2014; Port et al., 2016). We also previously tested this
The maximum-likelihood phylogenetic tree of all the 12S rRNA sequences from the custom reference database is shown in Supporting Information Figure S3. Sequences for which the best BLAST hit had a bit score below 80 or had <100% identity to any sequence in the curated database were considered nontarget sequences. To ensure full reproducibility of our bioinformatics analysis, the up-to-date (May 2017) custom reference database and the Jupyter notebook for data processing have been deposited in an additional dedicated GitHub repository (https://github.com/HullUni-bioinformatics/Li_et_al_2018_eDNA_filtration).
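The nontarget criterion described above (best hit with bit score below 80, or identity below 100%) can be illustrated with a short sketch. It assumes that BLAST results are available in the standard 12-column tabular layout (query, subject, percent identity, ..., e-value, bit score); this input format, the function name and the example rows are assumptions made for illustration and are not taken from the deposited notebook.

```python
import csv
import io


def classify_queries(blast_tsv, min_bitscore=80.0, min_identity=100.0):
    """Return {query: subject or None}; None marks a nontarget sequence.

    Only the best hit (highest bit score) of each query is evaluated.
    Queries with no hit at all do not appear in the BLAST table and
    therefore do not appear in the result.
    """
    best = {}
    for row in csv.reader(io.StringIO(blast_tsv), delimiter="\t"):
        if not row:
            continue
        query, subject = row[0], row[1]
        identity, bitscore = float(row[2]), float(row[11])
        if query not in best or bitscore > best[query][2]:
            best[query] = (subject, identity, bitscore)
    return {
        query: subject if (bitscore >= min_bitscore and identity >= min_identity) else None
        for query, (subject, identity, bitscore) in best.items()
    }


# Hypothetical hits: a perfect match, an imperfect match and a short, low-scoring match.
example = "\n".join([
    "read1\tRutilus_rutilus\t100.000\t106\t0\t0\t1\t106\t1\t106\t1e-50\t196",
    "read2\tAbramis_brama\t98.113\t106\t2\t0\t1\t106\t1\t106\t1e-45\t185",
    "read3\tCyprinus_carpio\t100.000\t40\t0\t0\t1\t40\t1\t40\t1e-10\t74.8",
])
assignments = classify_queries(example)
```

Here only `read1` keeps its species assignment; `read2` fails the identity criterion and `read3` fails the bit score criterion.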

| Criteria for reducing false positives and quality control
Filtered data were summarized into the number of sequence reads per species (hereon referred to as read counts) for downstream analyses (Supporting Information Appendix S1). We applied two criteria to reduce the possibility of false positives. (a) The low-frequency noise threshold (proportion of positive species read counts of all read counts in the real sample) was set to filter some high-quality annotated reads passing the previous filtering steps that have high-confidence BLAST matches but may be inaccurate due to potential low-level contamination during the library construction process (De Barba et al., 2014; Hänfling et al., 2016; Port et al., 2016). The low-frequency noise threshold was set to 0.001 in this study as deter-
To better quantify the heterogeneity between filtration replicates, the Horn similarity index was calculated based on species relative abundance using SPADER v0.1.1 (Chao, Ma, Hsieh, & Chiu, 2016) with the function SimilarityMult.
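The two steps described in this section can be sketched as follows. The first function applies a per-sample low-frequency noise threshold of 0.001; the second computes the plug-in (empirical) two-sample form of the Horn overlap measure from relative abundances, based on Shannon entropy. This is a minimal sketch under those assumptions: it is not the SpadeR estimator used in the study, and the function names and species codes are hypothetical.

```python
import math


def apply_noise_threshold(read_counts, threshold=0.001):
    """Zero out species whose share of a sample's reads is below `threshold`."""
    total = sum(read_counts.values())
    if total == 0:
        return dict(read_counts)
    return {
        species: (n if n / total >= threshold else 0)
        for species, n in read_counts.items()
    }


def shannon(proportions):
    """Shannon entropy (natural log) of a list of proportions."""
    return -sum(p * math.log(p) for p in proportions if p > 0)


def horn_similarity(counts_a, counts_b):
    """Empirical Horn overlap of two samples with equal weights.

    1 means identical relative abundances, 0 means no shared species.
    """
    total_a, total_b = sum(counts_a.values()), sum(counts_b.values())
    if total_a == 0 or total_b == 0:
        raise ValueError("both samples need at least one read")
    species = sorted(set(counts_a) | set(counts_b))
    p_a = [counts_a.get(s, 0) / total_a for s in species]
    p_b = [counts_b.get(s, 0) / total_b for s in species]
    pooled = [(x + y) / 2 for x, y in zip(p_a, p_b)]
    h_alpha = (shannon(p_a) + shannon(p_b)) / 2  # mean within-sample entropy
    h_gamma = shannon(pooled)                    # entropy of the pooled sample
    return 1 - (h_gamma - h_alpha) / math.log(2)


# Hypothetical replicates from one treatment (species codes are placeholders).
rep1 = apply_noise_threshold({"RUT": 9990, "BRE": 9, "CAR": 1})
rep2 = apply_noise_threshold({"RUT": 6000, "BRE": 4000, "CAR": 0})
similarity = horn_similarity(rep1, rep2)
```

Applying the threshold before computing similarity means that reads attributed to low-level contamination do not contribute to apparent differences between replicates.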

| Filtration time
The filtration time across all treatments and ponds varied from 3 to 120 min (Figure 3). There were significant effects of "treatment," "pond" and the "interaction" between ponds and treatments across the entire data set (

| DNA yield
The DNA concentration across all treatments and ponds ranged from 1.15 to 119.70 ng/μl (Figure 2). There were significant effects of "treatment," "pond" and the "interaction" between ponds and treatments across the entire data set (Table 2, Global). In relation to the specific comparisons: There was no significant effect of different pore sizes of filter (Table 2, Pore sizes, p = 0.07). Comparing the "0.45Sterivex" and the "0.45MCE," there were significant effects of "treatment" and "pond" (Table 2, Filter types). Individual post hoc tests showed that there was no significant difference between the "0.45Sterivex" and the "0.45MCE" treatments in ponds E1 to E3, but the total DNA yield recovered from the "0.45Sterivex" was significantly lower than that from the "0.45MCE" in pond E4 (Figure 2d). The average DNA yield recovered from the prefilters themselves ("PF") was the lowest of the six filtration treatments (Supporting Information
showed that the total DNA yield recovered from the "0.45MCE" was significantly higher than that from the "PF_0.45MCE" in pond E4 only (Figure 2d).
Figure 4). The rarest species in ponds E1 and E2 was A. brama. This species was not detected in pond E2 with any treatment, but it was detected with "0.45Sterivex" in pond E1.

| Probability of species detection
Rutilus rutilus was not detected using the prefilters ("PF") in pond E2 (Figure 4). In ponds E3 and E4, all stocked species were detected by all of the treatments (Figure 4c,d). There were significant effects of "treatment" and "pond" across the entire data set, but there was no significant effect of the "interaction" between ponds and treatments (
The Sterivex units ("0.45Sterivex") performed slightly better than the "0.45MCE" in terms of probability of species detection (Table 2, Filter types, p < 0.05). The average probability of species detection was the lowest using the prefilters themselves ("PF") of the six filtration treatments (0.64 ± 0.27, Supporting Information

| Variation between filtration replicates
Overall, there was considerable variation in species composition among individual filtration replicates within ponds (Figure 5a1,b1,c1,d1; Supporting Information Figure S5). In terms of Horn index (similarity between replicates), there were significant effects of "treatment," "pond" and the "interaction" between ponds and treatments across the entire data set (
In relation to the specific comparisons: Overall, Horn index significantly decreased with increasing pore size, but the pattern was complex as significant interactions between treatments and ponds were observed (Table 2, Pore sizes). Individual post hoc tests showed that not all pairwise comparisons among pore sizes were significant (e.g., pond E2). The NMDS analysis showed that there was only clear discrimination between the "0.45MCE" and the "0.8MCE" in pond E1 (Figure 5a2; Supporting Information Table S2, ANOSIM: R = 0.52, p = 0.01). There was greater variation among the "0.45Sterivex" replicates compared with the "0.45MCE" replicates (Figure 5). The community similarity of the "0.45Sterivex" was significantly lower than that of the "0.45MCE" across the four ponds (Table 2, Filter types; Figure 5a1,b1,c1,d1). The NMDS ordination showed that a significant difference was observed between the "0.45Sterivex" and the "0.45MCE" in ponds E3 (Figure 5c2; Supporting Information Table S2, ANOSIM: R = 0.64, p = 0.02) and E4 (Figure 5d2; Supporting Information Table S2, ANOSIM: R = 0.30, p = 0.02). Greater variance between replicates was observed for the prefilters ("PF")
TABLE 2 Two-way analysis of variance (ANOVA) results for filtration time, total DNA yield, species detection probability, correlation with abundance and correlation with biomass using six eDNA capture treatments across four ponds (E1-E4)

| Correlations between read counts and fish abundance or biomass
There were consistent, positive correlations between averaged read counts of five replicates and fish abundance or biomass across the six treatments and four ponds (Figure 6; Supporting Information Figure S6). There was no significant effect of "treatment," or "interaction" between ponds and treatments, on correlations between read counts and abundance or biomass across the entire data set (
Turner et al. (2014) previously determined that aqueous eDNA particles from common carp (Cyprinus carpio) ranged between <0.2 and >180 μm and therefore recommended 0.2-μm pore size filters for optimal capture of common carp eDNA. In a pilot study, we observed that this pore size of filter quickly led to clogging; therefore, we compared three pore sizes (0.45, 0.8 and 1.2 μm) of membrane filter.
Our study demonstrated that the filter pore size had considerable impact on filtration time. When changing from 0.45- to 0.8-μm filters, on average, 36% of filtration time was saved, whereas only 15% of filtration time was saved by increasing pore size from 0.8 to 1.2 μm.
This result supports previous studies (Eichmiller et al., 2016; Minamoto et al., 2016; Turner et al., 2014), indicating that the smaller pore size of filters was more likely to clog and increase filtration time. However, different pore sizes did not affect the amount of total eDNA recovered and probability of species detection. The similarity among filtration replicates decreased with increasing pore size; the repeatability among filtration replicates using the 0.45-μm MCE filters was the highest compared with the other pore sizes of filter.
This in turn indicates that stochastic sampling effects can be minimized by using filters with a smaller pore size. After pooling the data from all five replicates, consistently positive relationships were found between read counts and fish abundance or biomass, although correlations were not always statistically significant. The 0.
FIGURE 4 Species composition of averaged read counts (number of replicates = 5) using six eDNA capture treatments from four ponds (a-d correspond to ponds E1-E4, respectively). Species three-letter codes are given in Table 1, and abbreviations of treatments are the same as in Figure 2. "Bio" and "Abu" refer to species composition of fish biomass or abundance calculated based on
In contrast, Eichmiller et al. (2016) found that different pore sizes (0.2, 0.6, 1.0 and 5.0 μm) of PC filter affected the slope of the C. carpio biomass/eDNA copies relationship; 0.2-0.6 μm filters were optimal for biomass quantification in the laboratory. Turner et al. (2014) showed that PC filters have relatively uniform sized pores, in con-
The 0.45-μm MCE filters performed the best among the six filtration treatments in terms of DNA yield, repeatability within pond and consistency between ponds. However, filtration time was significantly longer for the 0.45-μm MCE filters than the 0.8-μm MCE
FIGURE 5 Pairwise Horn similarity index (a1-d1) and nonmetric multidimensional scaling (NMDS) (a2-d2) based on six eDNA capture treatments from four ponds (a-d correspond to ponds E1-E4, respectively). "Among" refers to all filtration replicates among treatments within pond (a1-d1). Treatments that differ significantly (p < 0.05) are indicated by the different letters in boxplots (a1-d1). The ellipses indicate the 50% standard error of each capture treatment (a2-d2). Species three-letter codes are given in Table 1, and abbreviations of treatments are the same as in Figure

| Performance of enclosed (Sterivex) filters
Previous studies showed that filtration using enclosed Sterivex units is an effective protocol for capturing target species DNA with qPCR assays (Bergman et al., 2016; Keskin, 2014; Spens et al., 2017

| Efficiency and impact of prefiltration
The water from Calverton fish ponds is turbid and eutrophic, with high levels of algae. Our pilot study showed that only a small amount of water (i.e., 250 ml) could be filtered through 1.2-μm filters before clogging. This is considerably less than in previous metabarcoding studies in less eutrophic lakes, in which at least 1 L water was filtered (Hänfling et al., 2016; Port et al., 2016), and reduced sample volumes could potentially impact rare species detection. Prefiltration could potentially help to prevent clogging, substantially reduce filtration time and reduce the capture of unwanted SPM and PCR inhibitors.
We therefore investigated the impact of prefiltration by comparing results from 0.45-μm MCE filters with and without prior passage of the water through 20-μm prefilters, as well as analysing the prefilters themselves.
Across the four ponds, it was possible to filter 300 ml water in around 4 min using the prefilters themselves. In terms of the prefilters themselves, the overall probability of species detection (0.64 ± 0.27) was lower than for other membrane filters, and greater variance between replicates was observed compared with other treatments. Similar results were found by Robson et al. (2016), who showed that 2 L water samples can be filtered in <3 min using 20-μm filters, but a 0.57 probability of single species detection was achieved compared with 1.00 probability using 3-μm PC filters.
Our results indicate that prefiltration with 20-μm filters could prevent SPM from clogging finer filters without affecting metabarcoding results but that the prefilters themselves are not suitable for metabarcoding because of their reduced total DNA yield, probability of species detection and repeatability. Despite the advantages of prefiltration demonstrated here, it should be noted that prefiltration requires more handling, which could increase the opportunity for contamination (Turner et al., 2014). Thus, we recommend that prefilters are applied only if absolutely essential for reducing the filtration time or increasing the throughput volume of the capture filters (Figure 1).

| CONCLUSION
This study demonstrates that the DNA yield, probability of species detection and correlations between abundance/biomass and read counts are encouragingly comparable between different filter types (0.45-μm MCE filters and 0.45-μm Sterivex units) and pore sizes (0.45, 0.8 and 1.2 μm). Therefore, eDNA metabarcoding results seem quite robust to the choice of the filtration method when a sufficient number of replicates is carried out. We note, however, that the suitability of various pore sizes of filter to capture eDNA is likely to be heavily influenced by the heterogeneous nature of water bodies. For turbid, eutrophic, high fish density ponds, such as those studied here, 0.8-μm MCE filters provide the optimal trade-off between rapid filtration time and probability of species detection, but smaller pore sizes of filter may be more suitable for clearer, low species density conditions. Further study of the impact of heterogeneity (in terms of SPM, biochemical oxygen demand "BOD," chemical oxygen demand "COD," dissolved oxygen "DO," pH, water colour, etc.) between water bodies on eDNA capture is required. Finally, we report high variation among filtration replicates, which is consistent with Lanzén, Lekang, Jonassen, Thompson, and Troedsson (2017) who indicated that technical replicates of DNA extraction can improve diversity and compositional dissimilarity. Spatial heterogeneity of eDNA within water bodies has also been reported in several studies (e.g., Civade et al., 2016; Hänfling et al., 2016; Jerde, Mahon, Chadderton, & Lodge, 2011; Pilliod et al., 2013). Future studies, for example, incorporating species occupancy models for imperfect species detection (Hänfling et al., 2016; Pilliod et al., 2013; Schmidt, Kéry, Ursenbacher, Hyman, & Collins, 2013; Valentini et al., 2016), are needed to further investigate the multiple opportunities for heterogeneity encountered in eDNA studies.

ACKNOWLEDGEMENTS
This work is part of PhD project of J.L., who is supported by Univer-

AUTHOR CONTRIBUTIONS
The study was conceived and designed by B