DevOps toolchain

DevOps toolchain

A DevOps toolchain is a set or combination of tools that aid in the delivery, development, and management of software applications throughout the systems development life cycle, as coordinated by an organization that uses DevOps practices. Generally, DevOps tools fit into one or more activities, which supports specific DevOps initiatives: Plan, Create, Verify, Package, Release, Configure, Monitor, and Version Control. == Toolchains == In software, a toolchain is the set of programming tools that is used to perform a complex software development task or to create a software product, which is typically another computer program or a set of related programs. In general, the tools forming a toolchain are executed consecutively so the output or resulting environment state of each tool becomes the input or starting environment for the next one, but the term is also used when referring to a set of related tools that are not necessarily executed consecutively. As DevOps is a set of practices that emphasizes the collaboration and communication of both software developers and other information technology (IT) professionals, while automating the process of software delivery and infrastructure changes, its implementation can include the definition of the series of tools used at various stages of the lifecycle; because DevOps is a cultural shift and collaboration between development and operations, there is no one product that can be considered a single DevOps tool. Instead a collection of tools, potentially from a variety of vendors, are used in one or more stages of the lifecycle. == Stages of DevOps == === Plan === Plan consists of two elements: "define" and "plan". This activity refers to the business value and application requirements. Specifically "Plan" activities include: Production metrics, objects and feedback Requirements Business metrics Update release metrics Release plan, timing and business case Security policy and requirement A combination of the IT personnel will be involved in these activities: business application owners, software development, software architects, continual release management, security officers and the organization responsible for managing the production of IT infrastructure. === Create === Create consists of the building, coding, and configuring of the software development process. The specific activities are: Design of the software and configuration Coding including code quality and performance Software build and build performance Release candidate Tools and vendors in this category often overlap with other categories. Because DevOps is about breaking down silos, this is reflective in the activities and product solutions. === Verify === Verify is directly associated with ensuring the quality of the software release; activities designed to ensure code quality is maintained and the highest quality is deployed to production. The main activities in this are: Acceptance testing Regression testing Security and vulnerability analysis Performance Configuration testing Solutions for verify-related activities generally fall under four main categories: Test automation, Static analysis, Test Lab, and Security. === Package === Package refers to the activities involved once the release is ready for deployment, often also referred to as staging or Preproduction / "preprod". This often includes tasks and activities such as: Approval/preapprovals Package configuration Triggered releases Release staging and holding === Release === Release related activities include schedule, orchestration, provisioning and deploying software into production and targeted environment. The specific Release activities include: Release coordination Deploying and promoting applications Fallbacks and recovery Scheduled/timed releases Solutions that cover this aspect of the toolchain include application release automation, deployment automation and release management. === Configure === Configure activities fall under the operation side of DevOps. Once software is deployed, there may be additional IT infrastructure provisioning and configuration activities required. Specific activities including: Infrastructure storage, database and network provisioning and configuring Application provision and configuration. The main types of solutions that facilitate these activities are continuous configuration automation, configuration management, and infrastructure as code tools. === Monitor === Monitoring is an important link in a DevOps toolchain. It allows IT organization to identify specific issues of specific releases and to understand the impact on end-users. A summary of Monitor related activities are: Performance of IT infrastructure End-user response and experience Production metrics and statistics Information from monitoring activities often impacts Plan activities required for changes and for new release cycles. === Version Control === Version Control is an important link in a DevOps toolchain and a component of software configuration management. Version Control is the management of changes to documents, computer programs, large web sites, and other collections of information. A summary of Version Control related activities are: Non-linear development Distributed development Compatibility with existent systems and protocols Toolkit-based design Information from Version Control often supports Release activities required for changes and for new release cycles.

Server.com

Server.com is a domain name that was owned by software as a service (SaaS) company Server Corporation. They offered a suite of services from 1996 until 2007. It was the first SaaS site to offer a variety of services and the first to use the term WebApp to describe its services. It was selected as an Incredibly Useful Site by Yahoo! Internet Life magazine. net magazine listed Server.com among the 100 most influential websites of all time. Server.com launched in 1996 offering the first online personal information manager. In 1997, they rolled out the first threaded message board service; the first web based mailing list manager; one of the first online calendar services; and one of the first online form builders. In 2000, Server.com partnered with NBCi and became server.snap.com until 2001. In 2001, Server.com was serving 100 million monthly pageviews. Media Life declared it one of the 20 biggest ad domains on the Web. In 2002, Server.com developed one of the first web-based RSS aggregators. In 2007, all services were moved to YourWebApps.com. The domain name Server.com was sold in 2009 for $770,000.

Data profiling

Data profiling is the process of examining the data available from an existing information source (e.g. a database or a file) and collecting statistics or informative summaries about that data. The purpose of these statistics may be to: Find out whether existing data can be easily used for other purposes Improve the ability to search data by tagging it with keywords, descriptions, or assigning it to a category Assess data quality, including whether the data conforms to particular standards or patterns Assess the risk involved in integrating data in new applications, including the challenges of joins Discover metadata of the source database, including value patterns and distributions, key candidates, foreign-key candidates, and functional dependencies Assess whether known metadata accurately describes the actual values in the source database Understanding data challenges early in any data intensive project, so that late project surprises are avoided. Finding data problems late in the project can lead to delays and cost overruns. Have an enterprise view of all data, for uses such as master data management, where key data is needed, or data governance for improving data quality. == Introduction == Data profiling refers to the analysis of information for use in a data warehouse in order to clarify the structure, content, relationships, and derivation rules of the data. Profiling helps to not only understand anomalies and assess data quality, but also to discover, register, and assess enterprise metadata. The result of the analysis is used to determine the suitability of the candidate source systems, usually giving the basis for an early go/no-go decision, and also to identify problems for later solution design. == How data profiling is conducted == Data profiling utilizes methods of descriptive statistics such as minimum, maximum, mean, mode, percentile, standard deviation, frequency, variation, aggregates such as count and sum, and additional metadata information obtained during data profiling such as data type, length, discrete values, uniqueness, occurrence of null values, typical string patterns, and abstract type recognition. The metadata can then be used to discover problems such as illegal values, misspellings, missing values, varying value representation, and duplicates. Different analyses are performed for different structural levels. E.g. single columns could be profiled individually to get an understanding of frequency distribution of different values, type, and use of each column. Embedded value dependencies can be exposed in a cross-columns analysis. Finally, overlapping value sets possibly representing foreign key relationships between entities can be explored in an inter-table analysis. Normally, purpose-built tools are used for data profiling to ease the process. The computational complexity increases when going from single column, to single table, to cross-table structural profiling. Therefore, performance is an evaluation criterion for profiling tools. == When is data profiling conducted? == According to Kimball, data profiling is performed several times and with varying intensity throughout the data warehouse developing process. A light profiling assessment should be undertaken immediately after candidate source systems have been identified and DW/BI business requirements have been satisfied. The purpose of this initial analysis is to clarify at an early stage if the correct data is available at the appropriate detail level and that anomalies can be handled subsequently. If this is not the case the project may be terminated. Additionally, more in-depth profiling is done prior to the dimensional modeling process in order assess what is required to convert data into a dimensional model. Detailed profiling extends into the ETL system design process in order to determine the appropriate data to extract and which filters to apply to the data set. Additionally, data profiling may be conducted in the data warehouse development process after data has been loaded into staging, the data marts, etc. Conducting data at these stages helps ensure that data cleaning and transformations have been done correctly and in compliance of requirements. == Benefits and examples == Data profiling can improve data quality, shorten the implementation cycle of major projects, and improve users' understanding of data. Discovering business knowledge embedded in data itself is one of the significant benefits derived from data profiling. It can improve data accuracy in corporate databases.

Utah Social Media Regulation Act

S.B. 152 and H.B. 311, collectively known as the Utah Social Media Regulation Act, were social media regulation bills that were passed by the Utah State Legislature in March 2023. The bills would have collectively imposed restrictions on how social networking services serve minors in the state of Utah, including mandatory age verification and age restrictions, as well as restrictions on data collection and on algorithmic recommendations. The Act was intended to take effect in March 2024. However, following a lawsuit over the Act by NetChoice, a tech industry lobby group, the Utah attorney general stated in January 2024 that its implementation had been delayed to October 2024, but was likely to be repealed and amended. On September 10, 2024 Chief Judge Robert J. Shelby issued a written order granting a request from NetChoice for a preliminary injunction, meaning that Utah will be unable to enforce its social media law as litigation plays out. The law was appealed to the 10th Circuit on October 11, 2024 and is awaiting a decision. == Provisions == The Act comprises two bills, S.B. 152 and H.B. 311, which respectively regulate access to social network accounts registered to minors, and impose obligations on social networking services to follow design practices that protect the privacy of minors. The bills would apply to social networks with more than 5 million active users in the United States. Social networking services would've verified the age of all users in the state of Utah, or else their account must've been deleted. The Act does not specify a specific method of age verification. Users who are under 18 must have consent from a parent or guardian to open an account, and the parent must be able to have access to the account and its data for monitoring. Unless required to comply with state or federal law, social networks were prohibited from collecting data based on the activity of minors, and may've not displayed targeted advertising or algorithmic recommendations of content, users, or groups to minors. A social network must not allow minors to access the service between the hours of 10:30 p.m., and 6:30 a.m. without parental consent. H.B. 311 prohibits social networks from exposing features to minors that cause them to have an "addiction" to the platform; the service must perform quarterly audits, and may be sued by users for harms caused by providing "addictive" features; there is a rebuttable presumption of harm if the plaintiff is 16 or younger. The bills prescribed fines of $2,500 per-violation for violations of the provisions of S.B. 152, and up to $250,000 in liabilities (plus fines of $2,500 per-user) for violations of the addiction rules. == History == The two bills were passed in early-March 2023, and signed by Governor Spencer Cox on March 23, 2023. Cox cited studies linking social media addiction to increases in depression and suicide among youth. They were originally intended to take effect on March 1, 2024. In the wake of a lawsuit in Arkansas by the trade association NetChoice over a similar bill, state senator and bill author Mike McKell stated that he planned to introduce amendments when the legislature resumed in 2024. In December 2023, NetChoice filed a lawsuit in Utah seeking to block the Act, citing that its definition of a social network was too vague, and that it "restricts who can express themselves, what can be said, and when and how speech on covered websites can occur, down to the very hours of the day minors can use covered websites. The First Amendment, reinforced by decades of precedent, allows none of this." In regards to its age verification requirements, NetChoice argued that "it may not be enough to simply verify the age of whatever person may be listed on a form of identification (even if they have such a record) because that record may not accurately reflect who the individual actually is." The office of the attorney general stated that the state was "reviewing the lawsuit but remains intently focused on the goal of this legislation: Protecting young people from negative and harmful effects of social media use." In January 2024, Attorney General Sean Reyes asked the court to delay a hearing over the bill, stating that its effective date had been delayed to October 2024, and that the legislature planned to repeal and replace the bills. On September 10, 2024, Federal Chief Judge Robert Shelby granted a preliminary injunction to stop enforcement of the law as litigation continues. The law was later appealed on October 11, 2024, by the state of Utah and had a court hearing on the appeal on November 20, 2025.

Social media as a news source

Social media as a news source is defined as the use of online social media platforms such as Instagram, TikTok, and Facebook rather than the use of traditional media platforms like the newspaper or live TV to obtain news. Television had just begun to turn a nation of people who once listened to media content into watchers of media content between the 1950s and the 1980s when the popularity of social media had also begun creating a nation of media content creators. Almost half of Americans use social media as a news source, according to the Pew Research Center. As social media's role in news consumption grows, questions have emerged about its impact on knowledge, the formation of echo chambers, and the effectiveness of fact-checking efforts in combating misinformation. Social media platforms allow user-generated content and sharing content within one's own virtual network. Using social media as a news source allows users to engage with news in a variety of ways including: Consuming and discovering news Sharing or reposting news Posting one's own photos, videos, or reports of news (i.e., engage in citizen or participatory journalism) Commenting on news posts Using social media as a news source has become an increasingly popular way for people of all age groups to obtain current and important information. Just like many other new forms of technology there are going to be pros and cons. There are ways that social media positively affects the world of news and journalism but it is important to acknowledge that there are also ways in which social media has a negative effect on the news. With this accessibility, people now have more ways to consume false news, biased news, and even disturbing content. In 2019, the Pew Research Center created a poll that reported Americans are wary about the ways that social media sites share news and certain content. This wariness of accuracy grew as awareness that social media sites could be exploited by bad actors who concoct false narratives and fake news. == Relationship to traditional news sources == Unlike traditional news platforms such as newspapers and news shows, social media platforms allow people without professional journalistic backgrounds to create news and cover events that news agencies might not cover. Social media users may read a set of news that differs slightly from what newspaper editors prioritize in the print press. A 2019 study found that Facebook and Twitter users are more likely to share politics, public affairs, and visual media news. Typically social media users circulate more towards posting about negative news. A study of tweets found that while optimistic-sounding and neutral-sounding tweets were equally likely to express certainty or uncertainty, the pessimistic tweets were nearly twice as likely to appear certain of an outcome than uncertain. These results could imply that posts of a more pessimistic nature that are also written with an air of certainty are more likely to be shared or otherwise permeate groups on Twitter. A similar bias towards negativity has developed on Facebook, where internal memos revealed that an algorithm built to promote "meaningful social interaction" actually incentivized publishers to promote negative and sensational news. Biases towards negativity need to be considered when the utility of new media is addressed, as the potential for human opinion to overemphasize any particular news story is greater despite general improvement. In order to compete in this rapidly changing technological environment, there has been an upheaval of traditional news sources onto online spaces. The production and circulation of newspaper prints have continued to globally decline in accordance with the increasing presence of news outlets on social media. Prominent platforms such as Twitter and Facebook have been key in engaging users through the integration of journalistic news into their newsfeeds. This feature has now become a foundational part of these apps' interfaces. Social media incentivizes both legacy news brands and individual professional journalists to share their reporting and interact with audiences on social platforms to boost engagement. However, most people who consume news on social media report that accessing news is not their main motivation for being on social media, but rather, they see and consume news incidentally. Nonetheless, informational interviews reveal that these consumers rely on being informed through social media. Some news consumers attest that a news brand's participation in social media does not improve their trust in the brand and that more in-depth reporting and more transparency about biases would improve trust instead. == Use as a news source == Globally, data from 2020 shows that over 70% of adult participants from Kenya, South Africa, Chile, Bulgaria, Greece, and Argentina utilized social media for news while those from France, the UK, the Netherlands, Germany, and Japan were reportedly less than 40 percent. According to the Pew Research Center, 20% of adults in the United States in 2018 said they get their news from social media "often," compared to 16% who said they often get news from print newspapers, 26% who often get it from the radio, 33% who often get it from news websites, and 49% who often get it from TV. The same survey found that social media was the most popular way for American adults age 18–29 to get news, the second-to-last most popular way for Americans age 20–49 to get news, and the least popular way for American adults age 50-64 and 65+ to get the news. In 2019, the Pew Research Center found that over half of Americans (54%) either got their news "sometimes" or "often" from social media, and Facebook was the most popular social media site where American adults got their news. However, at least 50% off all respondents reported that the following were either a "very big problem" or a "moderately big problem" for getting news on social media: One-sided news (83%) Inaccurate news (81%) Censorship of the news (69%) Uncivil discussions about the news (69%) Harassment of journalists (57%) News organizations or personalities being banned (53%) Violent or disturbing news images or videos (51%) In a later survey from the same year, the Pew Research Center reported that 18% of American adults reported that the most common way they get news about politics and the election was from social media. Additional source information shows that from politics and the United States presidential election in 2016, the popularity of fake news had grown to global attention. With this information, the study explains that more than 60 percent of adults receive their news from social media, the most popular being Facebook. With the increase of fake news, and the large amount of adult participation on these social media sites, it made it much harder for those who were searching for news to find a source that they could find credible. Another study found that adult participants found their own friends on Facebook to be a more reliable source of information online compared to a professional news organization. Although, when news was posted by a news organization online, they were then found more reliable compared to when they are shared by their online friends. Showing that adult participants found that the news that was only posted on Facebook and social media was much more credible to them than compared to other forms of information spreading. The study further states that these outcomes have the potential explanation that the topic of the news article played a part in the ways they were affected. This could have affected the way adult participants interacted with the different news sources, such as their online friends compared to a news organization, prominently because depending on the story, they want to have the correct information about the news from the most credible source. === By young people === Social media platforms are some of the most easily accessible forms of news and with the growing generations, the technology is only going to grow. With that, the use of social media in younger generations is also going to grow alongside it. Technology in the hands of young kids can be a concern moving into the future. Globally, there is evidence that through social media, youth have become more directly involved in protests, social campaigns and generally, in the sharing of news across multiple platforms. The number of people who use social media platforms such as Twitter, Facebook, Instagram, or Snapchat as ways to seek information has increased significantly in recent years especially for people who are part of the younger generation.TikTok is a rapidly expanding platform that young adults can use to find news content on social media. TikTok is one of the sites that young adults and teens utilize to get news about trending themes and controversial topics. The younger generation accepts without hesitation the information that thei

Isotropic position

In the fields of machine learning, the theory of computation, and random matrix theory, a probability distribution over vectors is said to be in isotropic position if its covariance matrix is proportional to the identity matrix. == Formal definitions == Let D {\textstyle D} be a distribution over vectors in the vector space R n {\textstyle \mathbb {R} ^{n}} . Then D {\textstyle D} is in isotropic position if, for vector v {\textstyle v} sampled from the distribution, E v v T = I d . {\displaystyle \mathbb {E} \,vv^{\mathsf {T}}=\mathrm {Id} .} A set of vectors is said to be in isotropic position if the uniform distribution over that set is in isotropic position. In particular, every orthonormal set of vectors is isotropic. As a related definition, a convex body K {\textstyle K} in R n {\textstyle \mathbb {R} ^{n}} is called isotropic if it has volume | K | = 1 {\textstyle |K|=1} , center of mass at the origin, and there is a constant α > 0 {\textstyle \alpha >0} such that ∫ K ⟨ x , y ⟩ 2 d x = α 2 | y | 2 , {\displaystyle \int _{K}\langle x,y\rangle ^{2}dx=\alpha ^{2}|y|^{2},} for all vectors y {\textstyle y} in R n {\textstyle \mathbb {R} ^{n}} ; here | ⋅ | {\textstyle |\cdot |} stands for the standard Euclidean norm.

Clustered file system

A clustered file system (CFS) is a file system which is shared by being simultaneously mounted on multiple servers. There are several approaches to clustering, most of which do not employ a clustered file system (only direct attached storage for each node). Clustered file systems can provide features like location-independent addressing and redundancy which improve reliability or reduce the complexity of the other parts of the cluster. Parallel file systems are a type of clustered file system that spread data across multiple storage nodes, usually for redundancy or performance. == Shared-disk file system == A shared-disk file system uses a storage area network (SAN) to allow multiple computers to gain direct disk access at the block level. Access control and translation from file-level operations that applications use to block-level operations used by the SAN must take place on the client node. The most common type of clustered file system, the shared-disk file system – by adding mechanisms for concurrency control – provides a consistent and serializable view of the file system, avoiding corruption and unintended data loss even when multiple clients try to access the same files at the same time. Shared-disk file-systems commonly employ some sort of fencing mechanism to prevent data corruption in case of node failures, because an unfenced device can cause data corruption if it loses communication with its sister nodes and tries to access the same information other nodes are accessing. The underlying storage area network may use any of a number of block-level protocols, including SCSI, iSCSI, HyperSCSI, ATA over Ethernet (AoE), Fibre Channel, network block device, and InfiniBand. There are different architectural approaches to a shared-disk filesystem. Some distribute file information across all the servers in a cluster (fully distributed). === Examples === == Distributed file systems == Distributed file systems do not share block level access to the same storage but use a network protocol. These are commonly known as network file systems, even though they are not the only file systems that use the network to send data. Distributed file systems can restrict access to the file system depending on access lists or capabilities on both the servers and the clients, depending on how the protocol is designed. The difference between a distributed file system and a distributed data store is that a distributed file system allows files to be accessed using the same interfaces and semantics as local files – for example, mounting/unmounting, listing directories, read/write at byte boundaries, system's native permission model. Distributed data stores, by contrast, require using a different API or library and have different semantics (most often those of a database). === Design goals === Distributed file systems may aim for "transparency" in a number of aspects. That is, they aim to be "invisible" to client programs, which "see" a system which is similar to a local file system. Behind the scenes, the distributed file system handles locating files, transporting data, and potentially providing other features listed below. Access transparency: clients are unaware that files are distributed and can access them in the same way as local files are accessed. Location transparency: a consistent namespace exists encompassing local as well as remote files. The name of a file does not give its location. Concurrency transparency: all clients have the same view of the state of the file system. This means that if one process is modifying a file, any other processes on the same system or remote systems that are accessing the files will see the modifications in a coherent manner. Failure transparency: the client and client programs should operate correctly after a server failure. Heterogeneity: file service should be provided across different hardware and operating system platforms. Scalability: the file system should work well in small environments (1 machine, a dozen machines) and also scale gracefully to bigger ones (hundreds through tens of thousands of systems). Replication transparency: Clients should not have to be aware of the file replication performed across multiple servers to support scalability. Migration transparency: files should be able to move between different servers without the client's knowledge. === History === The Incompatible Timesharing System used virtual devices for transparent inter-machine file system access in the 1960s. More file servers were developed in the 1970s. In 1976, Digital Equipment Corporation created the File Access Listener (FAL), an implementation of the Data Access Protocol as part of DECnet Phase II which became the first widely used network file system. In 1984, Sun Microsystems created the file system called "Network File System" (NFS) which became the first widely used Internet Protocol based network file system. Other notable network file systems are Andrew File System (AFS), Apple Filing Protocol (AFP), NetWare Core Protocol (NCP), and Server Message Block (SMB) which is also known as Common Internet File System (CIFS). In 1986, IBM announced client and server support for Distributed Data Management Architecture (DDM) for the System/36, System/38, and IBM mainframe computers running CICS. This was followed by the support for IBM Personal Computer, AS/400, IBM mainframe computers under the MVS and VSE operating systems, and FlexOS. DDM also became the foundation for Distributed Relational Database Architecture, also known as DRDA. There are many peer-to-peer network protocols for open-source distributed file systems for cloud or closed-source clustered file systems, e. g.: 9P, AFS, Coda, CIFS/SMB, DCE/DFS, WekaFS, Lustre, PanFS, Google File System, Mnet, Chord Project. === Examples === == Network-attached storage == Network-attached storage (NAS) provides both storage and a file system, like a shared disk file system on top of a storage area network (SAN). NAS typically uses file-based protocols (as opposed to block-based protocols a SAN would use) such as NFS (popular on UNIX systems), SMB/CIFS (Server Message Block/Common Internet File System) (used with MS Windows systems), AFP (used with Apple Macintosh computers), or NCP (used with OES and Novell NetWare). == Design considerations == === Avoiding single point of failure === The failure of disk hardware or a given storage node in a cluster can create a single point of failure that can result in data loss or unavailability. Fault tolerance and high availability can be provided through data replication of one sort or another, so that data remains intact and available despite the failure of any single piece of equipment. For examples, see the lists of distributed fault-tolerant file systems and distributed parallel fault-tolerant file systems. === Performance === A common performance measurement of a clustered file system is the amount of time needed to satisfy service requests. In conventional systems, this time consists of a disk-access time and a small amount of CPU-processing time. But in a clustered file system, a remote access has additional overhead due to the distributed structure. This includes the time to deliver the request to a server, the time to deliver the response to the client, and for each direction, a CPU overhead of running the communication protocol software. === Concurrency === Concurrency control becomes an issue when more than one person or client is accessing the same file or block and want to update it. Hence updates to the file from one client should not interfere with access and updates from other clients. This problem is more complex with file systems due to concurrent overlapping writes, where different writers write to overlapping regions of the file concurrently. This problem is usually handled by concurrency control or locking which may either be built into the file system or provided by an add-on protocol. == History == IBM mainframes in the 1970s could share physical disks and file systems if each machine had its own channel connection to the drives' control units. In the 1980s, Digital Equipment Corporation's TOPS-20 and OpenVMS clusters (VAX/ALPHA/IA64) included shared disk file systems.