8 results for "Will Brackenbury"
Search Results
2. Explaining Why: How Instructions and User Interfaces Impact Annotator Rationales When Labeling Text Data
- Author
Jamar Sullivan Jr., Will Brackenbury, Andrew McNutt, Kevin Bryson, Kwam Byll, Yuxin Chen, Michael Littman, Chenhao Tan, and Blase Ur
- Published
- 2022
3. KondoCloud: Improving Information Management in Cloud Storage via Recommendations Based on File Similarity
- Author
Kyle Chard, Andrew McNutt, Blase Ur, Will Brackenbury, and Aaron J. Elmore
- Subjects
Information management, World Wide Web, Metadata, Interface (Java), Computer science, Similarity (psychology), Data_FILES, Personal information management, Leverage (statistics), Cloud storage, Personal cloud
- Abstract
Users face many challenges in keeping their personal file collections organized. While current file-management interfaces help users retrieve files in disorganized repositories, they do not aid in organization. Pertinent files can be difficult to find, and files that should have been deleted may remain. To help, we designed KondoCloud, a file-browser interface for personal cloud storage. KondoCloud makes machine learning-based recommendations of files users may want to retrieve, move, or delete. These recommendations leverage the intuition that similar files should be managed similarly. We developed and evaluated KondoCloud through two complementary online user studies. In our Observation Study, we logged the actions of 69 participants who spent 30 minutes manually organizing their own Google Drive repositories. We identified high-level organizational strategies, including moving related files to newly created sub-folders and extensively deleting files. To train the classifiers that underpin KondoCloud’s recommendations, we had participants label whether pairs of files were similar and whether they should be managed similarly. In addition, we extracted ten metadata and content features from all files in participants’ repositories. Our logistic regression classifiers all achieved F1 scores of 0.72 or higher. In our Evaluation Study, 62 participants used KondoCloud either with or without recommendations. Roughly half of participants accepted a non-trivial fraction of recommendations, and some participants accepted nearly all of them. Participants who were shown the recommendations were more likely to delete related files located in different directories. They also generally felt the recommendations improved efficiency. Participants who were not shown recommendations nonetheless manually performed about a third of the actions that would have been recommended.
- Published
- 2021
4. Files of a Feather Flock Together? Measuring and Modeling How Users Perceive File Similarity in Cloud Storage
- Author
Galen Harrison, Kyle Chard, Blase Ur, Will Brackenbury, and Aaron J. Elmore
- Subjects
Information retrieval, Similarity (network science), Computer science, business.industry, Data management, Data_FILES, Personal information management, Filesystem Hierarchy Standard, Cloud computing, Directory, business, Cloud storage, Through-the-lens metering
- Abstract
Prior work suggests that users conceptualize the organization of personal collections of digital files through the lens of similarity. However, it is unclear to what degree similar files are actually located near one another (e.g., in the same directory) in actual file collections, or whether leveraging file similarity can improve information retrieval and organization for disorganized collections of files. To this end, we conducted an online study combining automated analysis of 50 Google Drive and Dropbox users' cloud accounts with a survey asking about pairs of files from those accounts. We found that many files located in different parts of file hierarchies were similar in how they were perceived by participants, as well as in their algorithmically extractable features. Participants often wished to co-manage similar files (e.g., deleting one file implied deleting the other file) even if they were far apart in the file hierarchy. To further understand this relationship, we built regression models, finding several algorithmically extractable file features to be predictive of human perceptions of file similarity and desired file co-management. Our findings pave the way for leveraging file similarity to automatically recommend access, move, or delete operations based on users' prior interactions with similar files.
- Published
- 2021
5. CYADB
- Author
Zechao Shang, Michael J. Franklin, Will Brackenbury, and Aaron J. Elmore
- Subjects
Database, Computer science, media_common.quotation_subject, General Engineering, computer.software_genre, Missing data, Data point, Ask price, Completeness (order theory), Data quality, Domain knowledge, Quality (business), Tuple, computer, media_common
- Abstract
Data completeness is becoming a significant roadblock in data quality. Existing research in this area currently handles the certainty of a query by ignoring the incomplete part and approximating missing attributes on partially complete tuples, but leaves open the question of how the missing data affect the quality of the results. This is particularly challenging when entire tuples are absent, which can affect query certainty in ways that are not immediately obvious. To aid this, we propose cyadb, a database that "covers your ask" by assessing the quality of a query answer when data are missing. cyadb is a human-in-the-loop system, in which the data owner utilizes his or her domain knowledge of data to specify aspects of the missing data, such as where it might be missing ("where"), how many data points are missing ("how many"), and how large the missing data points could be in comparison to the provided data ("how big"). Using this, cyadb calculates the query's missing sensitivity, the maximal size of the effect that the missing data could have on the given query. Additionally, cyadb provides concrete examples of missing data that match the missing sensitivity to help the user interactively refine the provided domain knowledge.
- Published
- 2018
6. How Users Interpret Bugs in Trigger-Action Programming
- Author
Michael L. Littman, Guan Wang, Abhimanyu Deora, Will Brackenbury, Weijia He, Blase Ur, Jillian Ritchey, and Jason Vallee
- Subjects
Computer science, media_common.quotation_subject, 05 social sciences, 020207 software engineering, 02 engineering and technology, Trigger action programming, Control flow, Debugging, Human–computer interaction, 0202 electrical engineering, electronic engineering, information engineering, Programming paradigm, 0501 psychology and cognitive sciences, 050107 human factors, media_common
- Abstract
Trigger-action programming (TAP) is a programming model enabling users to connect services and devices by writing if-then rules. As such systems are deployed in increasingly complex scenarios, users must be able to identify programming bugs and reason about how to fix them. We first systematize the temporal paradigms through which TAP systems could express rules. We then identify ten classes of TAP programming bugs related to control flow, timing, and inaccurate user expectations. We report on a 153-participant online study where participants were assigned to a temporal paradigm and shown a series of pre-written TAP rules. Half of the rules exhibited bugs from our ten bug classes. For most of the bug classes, we found that the presence of a bug made it harder for participants to correctly predict the behavior of the rule. Our findings suggest directions for better supporting end-user programmers.
- Published
- 2019
7. Draining the Data Swamp
- Author
Mainack Mondal, Kyle Chard, Michael J. Franklin, Will Brackenbury, Aaron J. Elmore, Rui Liu, and Blase Ur
- Subjects
Computer science, business.industry, Data management, Data discovery, 020207 software engineering, Context (language use), 02 engineering and technology, Information repository, Data science, Natural (archaeology), 020204 information systems, Similarity (psychology), 0202 electrical engineering, electronic engineering, information engineering, Production (economics), Cluster analysis, business
- Abstract
While hierarchical namespaces such as filesystems and repositories have long been used to organize data, the rapid increase in data production places increasing strain on users who wish to make use of the data. So-called "data lakes" embrace the storage of data in its natural form, integrating and organizing in a pay-as-you-go fashion. While this model defers the upfront cost of integration, the result is that data is unusable for discovery or analysis until it is processed. Thus, data scientists are forced to spend significant time and energy on mundane tasks such as data discovery, cleaning, integration, and management -- when this is neglected, "data lakes" become "data swamps." Prior work suggests that pure computational methods for resolving issues with the data discovery and management components are insufficient. Here, we provide evidence to confirm this hypothesis, showing that methods such as automated file clustering are unable to extract the necessary features from repositories to provide useful information to end-user data scientists, or make effective data management decisions on their behalf. We argue that the combination of frameworks for specifying file similarity and human-in-the-loop interaction is needed to aid automated organization. We propose an initial step here, classifying several dimensions by which items may be considered similar: the data, its origin, and its current characteristics. We initially consider this model in the context of identifying data that can be integrated or managed collectively. We additionally explore how current methods can be used to automate decision making using real-world data repository and file systems, and suggest how an online user study could be developed to further validate this hypothesis.
- Published
- 2018
8. k-regret queries with nonlinear utilities
- Author
Taylor Kessler Faulkner, Ashwin Lall, and Will Brackenbury
- Subjects
Nonlinear system, Mathematical optimization, Computer science, Constant elasticity of substitution, General Engineering, Regret, Diminishing returns, Upper and lower bounds
- Abstract
In exploring representative databases, a primary issue has been finding accurate models of user preferences. Given this, our work generalizes the method of regret minimization as proposed by Nanongkai et al. to include nonlinear utility functions. Regret minimization is an approach for selecting k representative points from a database such that every user's ideal point in the entire database is similar to one of the k points. This approach combines benefits of the methods top-k and skyline; it controls the size of the output but does not require knowledge of users' preferences. Prior work with k-regret queries assumes users' preferences to be modeled by linear utility functions. In this paper, we derive upper and lower bounds for nonlinear utility functions, as these functions can better fit occurrences such as diminishing marginal returns, propensity for risk, and substitutability of preferences. To model these phenomena, we analyze a broad subset of convex, concave, and constant elasticity of substitution functions. We also run simulations on real and synthetic data to prove the efficacy of our bounds in practice.
- Published
- 2015
Discovery Service for Jio Institute Digital Library