Knowledge and Practice Changes Following a Student Data-Focused Data Management Education Program

Introduction: It is known that graduate students work with research data more intimately than their faculty mentors. Because of this, much data management education is geared toward this population. However, student learning has predominantly been assessed through measures of satisfaction and attendance rather than through evaluating knowledge and skills acquired. This study attempts to advance assessment efforts by asking students to report their knowledge and practice changes before, immediately after, and 6 months following education. Method: Graduate students in science, technology, engineering, and mathematics (STEM) and health science disciplines self-enrolled in an 8-week pilot data management program that used their research projects as the focus for learning. Three surveys were administered (pre-, post-, and 6 months following) to determine changes in students’ knowledge and practices regarding data management skills through self-assessment. The survey consisted of approximately 115 Likert-style questions and covered major aspects of the data life cycle. Results and Discussion: Overall, students increased their data management knowledge and improved their skills in all areas of the data life cycle. Students readily adopted practices for straightforward tasks such as determining storage and improving file naming. Students improved in but struggled with tasks that were more involved, such as sharing data and documenting code. For most of these practices, students consistently implemented them through the 6-month follow-up period. Conclusion: Impact of data management education lasts significantly beyond immediate instruction. In-depth assessment of student knowledge and practices indicates in which areas this education is effective and in which areas it needs further support. It is likely that this effect is due to the program length and focus on implementation. 
Implications for practice
• An increase in data management knowledge and practices continues to at least 6 months with student-research-focused data management education.
• Students adopt and keep up simpler practices easily, including file-naming conventions, but need more time and support to adopt complicated practices such as data sharing.
• Data management education is effective when presented in a longer format, with compensation/incentives and time for practice implementation.

Received: 02/03/2021 Accepted: 07/07/2021
© 2021 Griffin. This open access article is distributed under a Creative Commons Attribution 4.0 License (https://creativecommons.org/licenses/by/4.0/)
ISSN 2162-3309 10.31274/jlsc.12906 jlsc-pub.org eP12906


INTRODUCTION
In previous work, the author described an approach to data management curriculum that was novel, relative to the literature, in three respects: by framing instruction content in three themes (discovery, infrastructure, and sustainability) that highlight interdependencies between data management practices, by centering all active learning activities on students' research projects, and by using a flipped classroom to give students time for implementing what they learned into their own research projects (Griffin, 2020). This paper follows that work and analyzes the changes in students' knowledge and practices up to 6 months after completing the data management program.

LITERATURE REVIEW
It is recognized that graduate students interact with research data more than faculty (Fong and Wang, 2015) and can have a shared or total responsibility for managing research data (Lage et al., 2011; Peters and Dryden, 2011). In their supervision of students, faculty acknowledge that students cannot yet manage data well (Carlson et al., 2011). But this is not solved during the course of graduate education, as industrial partners who employ these students do not consider them data fluent as they enter the workforce (Van Tuyl and Whitmire, 2018). Students report that they learn data management skills from peer lab members or through their mentor/supervisor (Pouchard and Bracke, 2016). However, researchers admit that they often do not know how to properly manage data (Tenopir et al., 2011) or do not have good habits. Although researchers are willing to get assistance, they are not always willing to pay for these services (Steinhart et al., 2012). Students are then left on their own to develop data management skills (Carlson et al., 2013; Johnston and Jeffryes, 2014a). Some may learn these skills through research ethics courses or seminars or in laboratory methods coursework, but these settings are not associated with direct research practice (Johnston and Jeffryes, 2014a). Ultimately, this leaves students without comprehensive formal training and with patchwork informal training (Bishop and Borden, 2020; Johnston and Jeffryes, 2014a).

Identifying specific graduate student needs
Graduate students' needs have been indicated in various environmental scans and needs assessments. However, this population has rarely been the focus of these investigations. Several studies have shown a need for data management training for students, sometimes teased out from a larger population. The majority were conducted from 2011 to 2015 and address various aspects of a data life cycle.
Similar to what has been seen in faculty needs assessments, the most common need identified for or by students involves the issues of storage and backup. These issues include what storage options are available within the institution, lab, or work group (Fong and Wang, 2015; Mattern et al., 2015); how long to store active or backup data (Bishop and Borden, 2020; Johnston and Jeffryes, 2014a); dealing with storage constraints and file types (Lage et al., 2011; Marcus et al., 2007); and what practices should be used for proper storage and backup (Adamick et al., 2012; Fong and Wang, 2015; Weller and Monroe-Gulick, 2015). This is not surprising, as Peters and Dryden (2011) noted that students and lab personnel are, in many cases, the ones in control of storage and backup.
Standards, documentation, and policy were sometimes mentioned as a student need. Many times, this was in relation to understanding how and when to apply data management skills and in what context. Understanding federal and university policies was mentioned (Adamick et al., 2012; Johnston and Jeffryes, 2014a), as were understanding lab policies and understanding values and norms for a given discipline (Pouchard and Bracke, 2016; Valentino and Boock, 2015). Standards were also mentioned in relation to applying metadata and creating codebooks (Carlson et al., 2011; Johnston and Jeffryes, 2014a; Pouchard and Bracke, 2016; Valentino and Boock, 2015).
Last, and mentioned least, were the issues of curating, preserving, and sharing data. Needs assessments mentioned that students share data, but it is mostly within the research community or within laboratory or collaborative groups (Marcus et al., 2007; Peters and Dryden, 2011).

Workshop-based data management education and assessment
Libraries have been continually offering data management education in the form of workshops, embedded lectures, seminar guest lectures, and similar events for over 10 years to address these needs (Fearon et al., 2013; Tenopir et al., 2012, 2015). The reported education efforts have taken a variety of formats. Most often, these are workshops (Adamick et al., 2012; Clement et al., 2017; Eaker, 2014; Johnston et al., 2012; Macy and Coates, 2016; Otto, 2012; Surkis et al., 2017; Thielen and Nichols Hess, 2017; Valentino and Boock, 2015), but they have also been integrated into seminar series (Fong and Wang, 2015), embedded in courses (Thielen and Nichols Hess, 2017), or taken place during other events such as a department annual retreat (Fong and Wang, 2015). Also, some were targeted to a specific discipline (Adamick et al., 2012; Fong and Wang, 2015; Macy and Coates, 2016; Thielen and Nichols Hess, 2017), whereas most were discipline agnostic or open to all disciplines. Standalone workshops tended to focus on a single topic such as writing data management plans (Johnston et al., 2012), finding and using data (Otto, 2012), or an overview of data management principles (Adamick et al., 2012; Fong and Wang, 2015; Valentino and Boock, 2015), whereas extended workshop series or day-long workshops tended to dive into a selection of data management practices (Clement et al., 2017; Eaker, 2014; Fong and Wang, 2015; Surkis et al., 2017; Thielen and Nichols Hess, 2017).
As many of these efforts were first attempts at offering data management education in some form, many of the assessments were limited to determination of participant satisfaction. Questions included whether the stated learning objectives were met, how useful the content was, whether the sessions would be recommended to others, what participants learned or still had questions about (or what they liked/did not like), and suggestions for improvement (Adamick et al., 2012; Eaker, 2014; Johnston et al., 2012; Macy and Coates, 2016; Surkis et al., 2017; Valentino and Boock, 2015). A few studies attempted to quantify learning or determine change in understanding. Thielen and Nichols Hess (2017) administered a student self-assessment that rated prior knowledge and perceived change. Clement et al. (2017) surveyed participants 8 months following the workshop, and participants reported they were still able to apply data management principles. Macy and Coates (2016) were able to review the course deliverables for data integration and quality and reported improvement. Finally, Otto (2012) was able to report an increase in finding and using data through both a pre-post test and examination of course assignments.
Attendance was often used as a measure of success, but attendance was not consistently documented or was sometimes recorded as incomplete. An initial number compiled from these studies counts 700 attendees. A total of 86 of those were graduate students, but, because of incompleteness, that number is unreliable. For workshop series, a common complaint was an initial high attendance with attrition over time.

Course-based data management education and assessment
Beginning in 2010, documentation of credit-bearing courses focused on data management began to emerge in the literature. Offering credit-bearing courses solves some problems seen with workshops and other similar formats. Courses encourage students to remain engaged with the materials because of the grading incentive, and they allow a breadth of data management principles to be taught with some depth. These courses also allow a more substantial assessment to be attempted through evaluating course assignments and other student products.
Qin and D'Ignazio (2010) offered a 14-week course that covered topics such as databases and programs, data formats, data description/metadata, managing data, cleaning, storage, access, organization/workflows, curation, archiving, funder requirements, visualization, and ethics. This effort was National Science Foundation funded and was offered for 2 years. Assessment included student feedback on whether assignments were useful or interesting. The authors did administer a pre-post survey asking students to evaluate their attitudes about science and being a data manager. From those results, Qin and D'Ignazio reported that perceived self-competence did not change. Whitmire (2015) offered a similar course that ran 11 weeks for two credits. The course covered a comparable range of topics and was partially assessed based on student satisfaction (how well sessions were preparing them to meet learning outcomes, what they liked most/least, suggestions for improvement). The summative assessment was an evaluation of a data management plan as their final project. Whitmire reports that "students rated the course highly" and "quality of data management plans indicated that they met learning objectives" (p. 10).  offered a two-credit, 14-week course designed for engineering graduate students. The content aligned with Accreditation Board for Engineering and Technology accreditation standards. An environmental scan of students and faculty showed that related data information literacy competencies were not currently covered in their education. Course content included writing data management plans, storage, documentation, metadata, visualization, sharing, curation, preservation, intellectual property/licensing, and ethics, all with a climate and space science focus. Assessments were based on student reflections (what they learned, whether they had more questions, and instruction effectiveness).
Similar to Whitmire (2015), a full data management plan, this time with an accompanying presentation, was the final project. A pre-post survey of student perceptions of importance and current knowledge in nine data information literacy competencies, as outlined in the course content, showed improvement in all competencies assessed. Although the course was meaningful, unfortunately enrollment in the course was low (n = 5).
Schmidt and Holles (2018) offered a three-credit course to science and engineering graduate students for a full semester. Similar to previous efforts, topics covered a similar range of data management skills. For assessment, students rated their self-knowledge in eight areas (data management planning, archiving/preservation, sharing/reuse, legal/ethical concerns, documentation/metadata, storage/backup/security, organization, and types/formats) on a five-point Likert scale (poor = 1 to excellent = 5). In the pre-course assessment, most areas averaged between 2 and 2.5, and only organization was rated 3. Afterward, every area averaged a minimum of 3, and organization knowledge averaged nearly 4. Increases were found for each area of education.
Last, Wiljes and Cimiano (2019) taught 15 data management courses over 6 years in Germany. The course was initially offered as an elective, but, in 2016, two master's programs made it mandatory. Course content remained comparable to what has already been discussed. Also, similar to many workshops and several courses, an assessment reported student-perceived value of content, their willingness to recommend the course to others, and their judgement on the importance of the topic. Attendance was mostly master's students with some bachelor's and doctoral students. All disciplines were welcome, but enrollees were predominantly from science, technology, engineering, and mathematics (STEM) disciplines.
Regardless of the format, most of these education efforts, particularly workshops, assess their success by participant satisfaction and attendance numbers. Coates et al. (2018) recognize that data management services by libraries are maturing, and efforts at more substantial assessment could be attempted. This is seen in the work by Schmidt and Holles (2018) and by , but only the study by Clement et al. (2017) evaluated whether the education had any lasting effect. This paper advances assessment efforts by asking students to self-report their knowledge and practice changes before, immediately after, and 6 months following data management education. Because of the design of the program, this study is also able to obtain more granular information about how students learn data management skills, beyond improvement in broad foundational concepts.

METHODS
As previously described, students participated in an 8-week pilot data management program that used a flipped classroom approach, centered the activities entirely on their current research projects, and dedicated time for participants to implement data management infrastructure into their projects (Griffin, 2020). A pretest was distributed after acceptance to the program and was required to be completed before the sessions started. The posttest was distributed and completed within 2 weeks after the program ended. The 6-month follow-up was distributed in February 2020 and was completed within 2 weeks. The full survey consisted of about 115 questions on knowledge, confidence, and practices regarding data management and took 30 to 45 minutes to complete. Data regarding student knowledge and practices are reported here.
Surveys were developed with the support of our Survey Research Lab. A senior graduate student not participating in the program also reviewed the questions, and the instruments were subsequently edited for bias and clarity. The Institutional Review Board reviewed the instruments and determined them to be exempt (#2019-0048). The survey was built in Qualtrics (Qualtrics, Seattle, Washington and Provo, Utah) and distributed to participants by email. The survey instruments can be found in Appendix A. Participants were compensated for participation in the program and completion of the surveys for a total of $150 in three distributions based on completion.
This pilot program aligned with the data information literacy competencies, and assessment was guided by those standards. Directly measuring skills is difficult at this time because of the differences in subject discipline needs regarding data management, the lack of standards in any discipline, and lack of infrastructure within institutions. Developing a single, measurable assessment instrument that directly assessed skills across these gaps and with granularity beyond what has been published to date was impossible. Therefore, participants were asked to assess their knowledge and practice changes and, at the end of the program, were asked to evaluate whether they over-, under-, or correctly estimated their knowledge in these areas. Likert scales were used for most questions. Participants were asked about their familiarity with data management concepts or information ("Not At All," "Not Very," "Somewhat," "Very," "Extremely") and how often they practiced a given skill ("Never," "Rarely," "Sometimes," "Often," "Always"). Questions were organized by data management domain and roughly follow different stages of a data life cycle. These domains include standards and documentation, organization and file management, storage and backup, and sharing.
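As a minimal sketch of how Likert responses of this kind can be summarized, the Python below codes the five familiarity labels as 1 to 5 and reports the label nearest to the mean. The numeric coding, the function names, and the sample responses are illustrative assumptions; the survey's scoring is not described here beyond the response labels.

```python
from statistics import mean

# Familiarity labels as quoted in the survey description.
# The 1-5 numeric coding is an assumption made for illustration.
FAMILIARITY = ["Not At All", "Not Very", "Somewhat", "Very", "Extremely"]
SCORES = {label: i + 1 for i, label in enumerate(FAMILIARITY)}


def mean_score(responses):
    """Average the numeric codes, ignoring skipped questions (None)."""
    answered = [SCORES[r] for r in responses if r is not None]
    if not answered:
        raise ValueError("no answered responses")
    return mean(answered)


def nearest_label(score):
    """Map a mean score back to the closest Likert label."""
    index = min(max(round(score), 1), len(FAMILIARITY)) - 1
    return FAMILIARITY[index]


# Hypothetical responses, not data from the study.
pretest = ["Not Very", "Somewhat", "Not Very", None, "Not At All"]
print(nearest_label(mean_score(pretest)))
```

Because Likert data are ordinal, a median or a full frequency table may be a safer summary than a mean; the mean is used in this sketch only because the results below are described as averages.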
Responses were downloaded and coded to anonymize individuals. Incomplete surveys were not included (participants who did not "submit" at the end), but individuals could skip questions. A total of 26 participants submitted the pretest, 18 submitted the posttest, and 16 submitted the 6-month follow-up. Analysis within this paper includes basic counts, percentages, and descriptive statistics. Responses regarding knowledge and practice changes before, after, and 6 months following data management education are reported.
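The counts above can be turned into completion rates with a few lines of Python. This is a hedged sketch: the submission counts come from the text, but the rates themselves are derived here rather than reported by the study, and the helper name `percent` is hypothetical.

```python
def percent(count, total):
    """Return count/total as a percentage rounded to the nearest whole number."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * count / total)


# Submission counts as stated in the text; 26 participants enrolled.
submitted = {"pretest": 26, "posttest": 18, "6-month follow-up": 16}
enrolled = 26

for survey, n in submitted.items():
    # Derived completion rate relative to enrollment (not reported in the study).
    print(f"{survey}: {n}/{enrolled} ({percent(n, enrolled)}%)")
```

Note that per-question percentages use the number of respondents who answered that question at that time point as the denominator, so denominators differ between the pretest, posttest, and follow-up.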

RESULTS
Program participants were restricted to students in STEM or health science disciplines who were in a graduate-level program working on original research. A total of 26 participants enrolled in the program. Most were obtaining doctoral degrees (24), and two were pursuing a master's. Seven disciplines were represented: nursing, medicine, public health, applied health sciences, engineering, liberal arts and sciences, and pharmacy. Students' progress toward their degrees ranged from the first year of study to within 6 months of graduation. Before beginning, students reported that only three of them had previous data management education, and this was through a credit-bearing course (3/26 [11%]); the rest had none.

Standards and documentation
When asked how familiar students are with documentation standards required by journals, funders, or their research community, on average, most replied that they were "Not Very" familiar. After the program, they reported being at least "Somewhat" familiar (Figure 1). They also reported an increased familiarity with established metadata standards, ontologies, data elements, and similar items averaging from "Not Very" to at least "Somewhat" familiar. This was retained through the 6-month follow-up. Relatedly, when asked to give a definition for metadata, about half could give a correct or partially correct one, but, after education, almost all could give a solid, correct definition (Figure 2). When asked more specific questions regarding their documentation knowledge, they increased familiarity ("Somewhat" to "Very") with wayfinding tools such as tables of contents and creating identification systems for data and increased from "Not at All" to "Somewhat" with creating inventories.
For the infrastructure work that they did create and implement, students reported that they were creating more continuity and context throughout their projects (Table 1).

Organization and file management
Students were, on average, "Somewhat" familiar with folder organization practices before the program began. Afterward, they reported being "Very" familiar with them both immediately after education and 6 months following. Half of these students (13/25 [52%]) already had an intentional folder organization scheme, and, after the program, 89% had one, and this continued through the 6 months. Of those with an intentional folder organization at each time point, most said that they are "Often" consistent with following this scheme (pretest = 61.5%).
Students reported being "Not Very" familiar with file management practices, which include file-naming conventions, versioning files, and working with file management software (Figure 3). After the program, they reported being "Very" familiar with naming and versioning but still "Not Very" familiar with software that manages files such as electronic lab notebooks (ELNs). Most students "Rarely" or "Never" use software to manage files even after the program and through the 6-month follow-up.
When asked, only 24% (6/25) of students reported having a standardized file-naming convention initially. After the program, 83% did (15/18), and, 6 months following, 94% did (15/16). If the students did have a file-naming convention, most of them reported that the conventions were consistent across projects, across all time points asked. For those that had one, most consistently followed that convention "Often" but not "Always." Immediately after and during the follow-up, they still consistently followed it "Often," but one-third were now "Always" following it.
Because including dates in file names is common practice, students were also asked about their date conventions. Half of the students already used a standard date format in their file names, and, immediately after the program, 77% (13/18) and later 81% (13/16) did. Similar to what was seen with following their organization scheme, if students included or had started including a standard date format, students were split between following this format "Often" or following it "Always," at all times asked; no students followed this date format "Sometimes" or less frequently.
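To make the idea of a standardized file name with a standard date format concrete, the sketch below checks names against one possible convention. The convention itself (`<project>_<description>_<YYYY-MM-DD>_v<NN>.<ext>`), the regular expression, and the example file names are hypothetical; the program's recommended convention is not specified in this paper, and each student's scheme would differ.

```python
import re
from datetime import datetime

# Hypothetical convention: <project>_<description>_<YYYY-MM-DD>_v<NN>.<ext>
# Lowercase letters, digits, and hyphens only; underscores separate the parts.
PATTERN = re.compile(
    r"^(?P<project>[a-z0-9]+)_(?P<description>[a-z0-9-]+)_"
    r"(?P<date>\d{4}-\d{2}-\d{2})_v(?P<version>\d{2})\.(?P<ext>[a-z0-9]+)$"
)


def follows_convention(filename):
    """Return True if the name matches the pattern and the date is a real date."""
    match = PATTERN.match(filename)
    if match is None:
        return False
    try:
        # Reject impossible dates such as month 13 or day 40.
        datetime.strptime(match.group("date"), "%Y-%m-%d")
    except ValueError:
        return False
    return True


# Hypothetical file names, not taken from the study.
for name in ["trial_survey-export_2020-02-14_v01.csv", "final data (2).xlsx"]:
    print(name, follows_convention(name))
```

A year-month-day date order is used in this sketch because it sorts chronologically when file names are sorted alphabetically, which is one reason such formats are commonly suggested.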
Students, on average, reported that they "Sometimes" can distinguish files from one another or distinguish different versions of the same file data. After the program and 6 months following, they were "Often" able to distinguish files and file versions from one another (Figure 4). Interestingly, at any time point, only about 25% of students reported needing to open files in order to know what is in them.

Storage and backup
Within the domain of storage and backup, students were "Somewhat" to "Not Very" familiar with their options and best practices. After the program, they were "Very" familiar with all of these (Figure 5). Most students also reported backing up their data (20/26, 77%), but after the program and 6 months following, all students who completed the assessment were backing up their data.
They were also consistent in frequency of backing up their data from "Sometimes" to "Often," but after the program, they improved in also backing up supporting documentation that explains their research data (Figure 6).
Students were also asked what the relationship was between storage locations and the data that they contain in order to see whether those practices had changed as well. Figure 7 demonstrates that, before the program, most students who "backed up data" stored the data in many locations, but there was little relationship between storage location and individual or redundant data files. After the program, more students had their data compiled at one location and copied to other locations, and no students needed to know which location their data was in to find it.

Sharing
Similar to what was seen in the "Standards and documentation" section, students were "Not Very" familiar with data sharing requirements by various stakeholders but became more so over time, particularly with journals (Figure 8). Regarding sharing practices, students were more often sharing their documentation than their research data. As they progressed, they increased sharing documentation through the 6 months, whereas sharing research data initially increased and dropped off, but not to the original levels (Figure 9).

Self-estimate
At the end of the program, but not at 6 months, students were asked to provide a self-estimate of their knowledge of data management practices across the different domains. Most students reported that they overestimated their knowledge of organization and file management. Documentation, sharing, and storage knowledge were split between students who correctly estimated and those who overestimated, whereas most students correctly estimated their knowledge of standards and policies. Very few students underestimated their abilities in any domain (Figure 10).

DISCUSSION
Because data management education is not yet mandatory in higher education, students are exposed to these concepts and practices in an inconsistent manner or through education that is disassociated from their day-to-day research work. Because of this, it has been difficult to determine instruction effectiveness. This study aimed to measure whether differences could be seen in data management knowledge and practices over time after data management education that was specifically aimed at learning and practicing within their research projects.
This pilot program shared similarities with both workshop series and data management courses. It allows data management principles to be taught in-depth because it takes place over 8 weeks, similar to many workshop series and academic quarters. The topics in this course are comparable in that they cover most of the data life cycle and the skills associated in each step. It also incentivizes student engagement-in this case, with compensation rather than with grades. This program differs in that the curriculum was designed to teach data management skills as interrelated and dependent skills, which has not been done before. It also uses a flipped classroom model, which has only been documented once (Johnston and Jeffryes, 2014b). These approaches may be contributing factors to increased knowledge and skills over the time surveyed.
Assessment for this program is also more specific than what has been typically administered and moves beyond measures of attendance and satisfaction. Similar to other authors' work (Schmidt and Holles, 2018;Thielen and Nichols Hess, 2017), this study relies on students' self-assessment, which is in line with the student research focus of the curriculum. The diversity in disciplines and research topic made it impractical to develop tests that objectively measure student knowledge. Considering that the students are usually accurate in estimating their knowledge (Figure 10), it is reasonable to trust their self-perception of knowledge and skills, which has been done in-depth here. This assessment also indicates that these skills and practices are retained beyond the immediate instruction period. Only Clement et al. (2017) have reported whether skills were sustained but that evaluation was in broader terms. The data presented here can be helpful to demonstrate library education's impact on a longer time scale regarding more specific skills.
Overall, this study shows improvement of students' knowledge and practices in almost all data management domains. For all categories investigated, students reported being initially "Not Very" or "Somewhat" familiar with foundational data management concepts. After education, the improvement was typically one rank higher to "Somewhat" or "Very" familiar, and this was maintained over time.
Looking granularly at acquisition and retention of individual practices within these domains, beginning with standards and documentation, this study shows that students increased familiarity with a variety of standards and expectations across stakeholder types. They also increased the creation of tools and various documentation types such as codebooks but struggled with maintaining documentation of analysis or coding. Not all students use code in their projects, but all are doing some sort of analysis. Future education will need to understand that, as students learn these skills, they need additional support to implement and maintain them. They are having difficulty on their own, even with initial time for implementation.
Similarly, within organization and file management, students pick up these skills readily and typically have some sort of scheme already in place. The question is whether those skills can be refined to be more effective for their projects. It appears that students acquire file-naming skills easily and use them consistently. On the other hand, students did not improve knowledge of software options or use. This topic was discussed in the program but not encouraged as a solution, as there is no institutional support for these products at any level at this institution, and the program did not want to impose a technical solution to a potentially collaborative workflow within the students' labs. Because of this, this is not a surprising result. However, this may be an area that needs to be developed. Research into marketplace needs regarding software use (academic, industrial, or other) would help align this education area.
Storage and backup also showed significant improvement. Most students were able to discern appropriate storage options and backup practices. This is likely because these activities are a "set it and forget it" type, until circumstances change. This can certainly happen within a student project, but with options known and understood, or even anticipated, the decision becomes simpler to execute. Students are often constrained regarding storage and backup. It is known that most students store data on their personal devices, whether appropriate or not. With guidance, it seems that students will readily change their habits in this area.
Last, students' familiarity with data sharing does improve with education. Unsurprisingly, this aligns with stakeholders that are closest to them, such as journals rather than funders. They do not sustain sharing research data as much as sharing supporting materials. This study did not delve into who they share with and how. Further work might explain why sharing research data is not as strong. One factor may be where they are within the timeline of their projects. Within this program, students range between just starting their research to being within 6 months of graduation. It is possible that this data reflects those who have not progressed through their work enough to share. However, they are aware of the need to do so.
Based on the student self-estimate, students are accurate in their ability to determine their needs and estimate their knowledge and practices. They overestimated knowledge in domains that were already familiar to them but correctly estimated within domains that were less familiar. Because students elect to participate in a program like this one, they have already identified that there could be improvements, and so this is an unsurprising result. If programs like this become mandatory, it would be interesting to see the changes in estimates, as students who believed that they already have adequate knowledge and practices (and would therefore not take a data management course) would now be included in the data.
This work supports the hypothesis that centering students' research projects in data management education has lasting effects, but further studies directly comparing programs that center students' work with those that do not will be needed.

Limitations
Because of the small number of participants, stronger statistical assessment of the data was not appropriate. The small number and the focus on STEM and health science graduate students from one institution prevent the data from being generalized to all graduate students. Students were being compensated for participation in the program and knew that there would be a follow-up survey at 6 months. It is possible that the compensation and the expectation of follow-up affected their practices. Because of the diverse discipline representation among the student participants and the lack of data management standards across disciplines, a direct assessment of students' skills was not possible at this time.  summarize well some of the challenges associated with data management education.

CONCLUSION
"Time is needed to explore data management topics in-depth…Shorter education opportunities such as workshops or seminars cannot cover the same breadth of topics, and do not give students the opportunity to apply or practice…Through the progression of the course, students see how topics are connected and related" (p. 9).
Because data management education is elective, is offered as workshops, and is not linked to other incentives, it can be difficult to evaluate learning. Although workshops serve the primary purpose of teaching data management principles and practices, they also serve a secondary purpose of raising awareness of the library as a data management service provider. This is seen in assessments that measure attendance and general satisfaction, as well as ask for suggestions for improvement. These assessment types rely on satisfaction to justify spending personnel time and/or money but often do not directly assess student skills.
Data management education has progressed to the point at which competencies have been developed (Calzada Prado and Marzal, 2013; Piorun et al., 2012; Qin and D'Ignazio, 2010; Schneider, 2013). Also, a pilot competency matrix for scaffolding data management skill development has been proposed (Sapp Nelson, 2017). Most workshops and formal courses on data management align with one of these competency frameworks, with the Association of College and Research Libraries framework for information literacy and/or with discipline accreditation.
Aligning with competencies and accreditation standards, embedding data management in courses, or offering it as a standalone course can help to determine whether data management skills are being acquired. However, this is hindered by the lack of data management infrastructure at any level and the lack of established standards to measure against. It is likely that we will have the opportunity to further measure data management skills in the coming years, as expectations for data management increase and the developed competencies are applied. This program and its granular assessment of student knowledge and practice change contribute to this effort.
Requirements for data sharing and open science are increasing through all levels of the academic enterprise. Funders are requiring data sharing to create transparency and trust with the public or stakeholders and to reduce financial waste; journals are requiring it to bolster the peer review process and create a pipeline for "value added" services; and researchers themselves are engaging in it because, among other reasons, the complexity of data precludes "single author" science. However, despite all these drivers, data management education has not yet been prioritized. Libraries have stepped up to fill this gap as best they can but are hindered by the need to justify their place in this arena and to entice students to participate. Therefore, measures of satisfaction can only go so far. Steps that move us to more accurately measure students' data management skills may lead us to data that can be leveraged within our institutions to argue for data management education to be considered a core skill and a contributor to student success overall.

HUMAN SUBJECTS RESEARCH
Assessment of student knowledge, confidence, and behaviors regarding data management and artifacts produced during this program were reviewed by the Institutional Review Board at the University of Illinois Chicago and were determined to be exempt (#2019-0048). Informed consent was obtained from all students included in this program.

FUNDING STATEMENT
Program development and execution, student compensation for participation, and assessment of instrument development were supported by a grant from the National Networks of Libraries of Medicine-Greater Midwest Region's (NNLM-GMR) Research Data Award and the National Institutes of Health in a cooperative agreement (1UG4LM012346). The content is solely the responsibility of the author and does not necessarily represent the official views of the National Institutes of Health.

APPENDIX A
The three surveys were identical except that students were asked to estimate their knowledge and practices at the immediate program end. The full survey is approximately 120 questions. Not all questions were analyzed for this manuscript. Questions regarding confidence and associated behaviors will be reported in future work and are also excluded from this appendix. The questions used for this analysis are included below. (Please note that the question numbers were not visible to the participants and are not sequential. Questions are displayed here in the order they were presented to the participants.)
Q1 This survey will compare the change in confidence, knowledge, and practices you currently have regarding data management relative to when you began the program. (1)