Zacharias 🐝 Voulgaris

3 years ago · 2 min. read


Data Synthetics without A.I. and Why This Adds Value to You as an Individual

Data Synthetics is a term I coined for the framework and processes involved in synthesizing data (instead of just analyzing it). In my view, it's by far the most significant thing in data science today and one of the many applications of A.I.: specialized systems generate new data from a given dataset while maintaining the properties of the original. But isn't there an abundance of data out there? Well, yes, but we could always use some more. The rationale is much like that of a fiction writer, who often prefers to create her own characters for a novel or a short story even though there are plenty of real-world characters she could copy into her text. So, if you don't want to be part of someone else's work of fiction (especially one that gets published and read by many other people), you may want to keep your personally identifiable information (PII) from roaming free in the world. Some of that information you may be unable to change (e.g., health-related PII, aka PHI), so protecting it is of paramount importance.

Data synthetics can do this for you by generating new data that closely resembles the existing data, thereby creating an unbridgeable gap between your PII and the data used by, say, a predictive model. Because the general underlying pattern (aka the signal in the data) remains the same, the predictions can still be relevant to you.
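To make the idea of "new data that resembles the old" a bit more concrete, here is a minimal sketch in Python using only the standard library. It is an illustration under my own assumptions, not the method discussed later in this post: it draws each synthetic value by linearly interpolating between neighbouring sorted values of a single continuous variable. The function name `synthesize` and the sample numbers are hypothetical.

```python
import random
import statistics


def synthesize(values, n, seed=0):
    """Draw n synthetic values by linear interpolation between
    neighbouring order statistics of the original sample."""
    rng = random.Random(seed)
    ordered = sorted(values)
    last = len(ordered) - 1
    out = []
    for _ in range(n):
        pos = rng.uniform(0, last)  # a fractional rank within the sample
        lo = int(pos)               # index of the value just below
        hi = min(lo + 1, last)      # index of the value just above
        frac = pos - lo
        out.append(ordered[lo] + frac * (ordered[hi] - ordered[lo]))
    return out


# Hypothetical "original" measurements, made up for illustration.
original = [61.2, 58.4, 70.1, 65.3, 62.8, 74.9, 59.7, 68.0, 66.5, 63.1]
synthetic = synthesize(original, 1000)

print(round(statistics.mean(original), 2), round(statistics.mean(synthetic), 2))
print(min(synthetic) >= min(original), max(synthetic) <= max(original))
```

The synthetic values stay within the range of the original sample and follow roughly the same distribution, yet they are freshly generated numbers rather than copies of someone's record. A real tool would also need to handle relationships between several variables, which this sketch ignores.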

Plenty of brilliant A.I. professionals, be they scientists or engineers, have delved into this problem and come up with mathematically elegant solutions. One such solution is the Variational AutoEncoder (VAE), a kind of artificial neural network (ANN) that aims to figure out the underlying distributions of the data and create new data based on them. These distributions are a mathematical model that aims to describe the signal. It's not the only such model, and probably not the best one either, but it's good enough for something basic. The problem with VAEs (and other A.I. systems) is that they need sufficiently large datasets to figure out this signal and manifest it in new data. Additionally, building a VAE isn't simple unless you understand the technology and the not-so-trivial math involved.
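The "figure out the distribution, then create new data from it" idea can be sketched in its simplest possible form without any neural network. The Python snippet below is not a VAE: as a stated simplification, it swaps the learned model for a single bivariate Gaussian fitted with means, standard deviations, and a correlation coefficient. The function names and the height/weight numbers are hypothetical, chosen only for illustration.

```python
import math
import random
import statistics


def fit_gaussian_2d(xs, ys):
    """Estimate means, standard deviations and Pearson correlation."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sx, sy = statistics.stdev(xs), statistics.stdev(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)
    return mx, my, sx, sy, cov / (sx * sy)


def sample_gaussian_2d(params, n, seed=0):
    """Draw n correlated (x, y) pairs from the fitted Gaussian."""
    mx, my, sx, sy, rho = params
    rng = random.Random(seed)
    # Guard against tiny negative values caused by floating-point rounding.
    rest = math.sqrt(max(0.0, 1.0 - rho * rho))
    pairs = []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        x = mx + sx * z1
        y = my + sy * (rho * z1 + rest * z2)  # mixes z1 in to keep the correlation
        pairs.append((x, y))
    return pairs


# Hypothetical heights (cm) and weights (kg), made up for illustration.
heights = [160, 165, 170, 172, 175, 178, 180, 185]
weights = [55, 60, 66, 68, 72, 75, 80, 88]

params = fit_gaussian_2d(heights, weights)
synthetic = sample_gaussian_2d(params, 2000)
syn_x = [p[0] for p in synthetic]
syn_y = [p[1] for p in synthetic]
check = fit_gaussian_2d(syn_x, syn_y)
print(round(params[4], 3), round(check[4], 3))
```

The two printed correlations should be close, which is the "signal" surviving in the new data. A VAE pursues the same goal with a far more flexible model, which is also why it needs much more data to fit.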

What if there were a way to develop synthetic data without utilizing A.I.? What if all you needed to know was the math you learned in school and a few other things built on that math, elegant but not overly sophisticated? Well, that's what I've done recently, with enough success to consider the result usable and useful. This framework, which I call ROOF (hence the picture at the top) and which I developed in Julia 1.5, is light on computational resources and can be applied to any kind of continuous data (there is also a version for ordinal data, though I imagine that's not something you care about that much). If you are in this line of work or know someone who is, feel free to reach out to me. Cheers!

#PII #VAE #Individual #Value #Synthetics
