5.3.2 Repairing Outliers

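Deleting outliers outright is the most direct approach, but it shrinks the sample. With a small dataset, losing samples can affect the results; with a dataset that contains many outliers, deleting them in bulk can affect the results as well. So when the outliers themselves are not worth studying, they should be repaired rather than deleted. There are two main ways to repair an outlier: modifying the value and replacing the value.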
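The trade-off between deleting and repairing can be illustrated with a small sketch. It is not part of the Kettle workflow: the helper names, the sample heights, the 100 to 250 plausibility range, and the choice of the median as the replacement value are all assumptions made for illustration.

```python
from statistics import median

# Hypothetical plausibility range for a height value (an assumption,
# not a rule taken from the case study).
LOW, HIGH = 100, 250

def is_outlier(height):
    """Return True when the value falls outside the assumed range."""
    return not (LOW <= height <= HIGH)

def drop_outliers(heights):
    """Deletion: the simplest fix, but the sample shrinks."""
    return [h for h in heights if not is_outlier(h)]

def replace_outliers(heights):
    """Repair: swap each outlier for the median of the valid values."""
    fill = median(h for h in heights if not is_outlier(h))
    return [fill if is_outlier(h) else h for h in heights]

sample = [174, 189, 260, 155, 172]   # 260 is the implausible value
dropped = drop_outliers(sample)      # 4 values remain
repaired = replace_outliers(sample)  # still 5 values
```

Deletion leaves four values, while the repaired list keeps all five, which is the reason for preferring repair when the dataset is small.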

1. Case introduction

Use Kettle to replace and modify the outliers in the data table interpolation_data.

2. Data preparation

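The data table interpolation_data holds height survey data for 500 people and contains the fields id, Gender, and Height. Its contents are shown in the figure (note: only some of the rows are captured).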

[Figure: partial contents of the interpolation_data table]

3. Detailed steps

(1) Open Kettle and create a transformation

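In Kettle, create a transformation named fill_unusual_value and add the following steps to it: "Table input" (表输入), "Filter rows" (过滤记录), "Dummy (do nothing)" (空操作(什么也不做)), "Null if..." (设置值为NULL), "Merge rows (diff)" (合并记录), "If field value is null" (替换NULL值), and "Select values" (字段选择). Then connect the steps with hops.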
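To make the logic of this transformation concrete, the data flow can be sketched in Python. This is only a rough analogue, not Kettle code, and it reflects one plausible reading of how the steps fit together: the filter condition (heights above 250), the fill value of 170.0, and every function name are assumptions, because the real settings are made in the step dialogs. The merge is also simplified to a plain union sorted by id.

```python
# Assumed threshold and fill value; the real ones are set in the Kettle dialogs.
MAX_HEIGHT = 250
FILL_VALUE = 170.0

def filter_rows(rows):
    """Filter rows: split the stream into outlier rows and normal rows."""
    outliers = [r for r in rows if r["Height"] > MAX_HEIGHT]
    normal = [r for r in rows if r["Height"] <= MAX_HEIGHT]
    return outliers, normal

def set_null(rows):
    """Null if...: blank out the Height of every row in the outlier branch."""
    return [{**r, "Height": None} for r in rows]

def merge_rows(a, b):
    """Merge: recombine both branches, ordered by id (simplified)."""
    return sorted(a + b, key=lambda r: r["id"])

def replace_null(rows):
    """If field value is null: fill the blanks with the replacement value."""
    return [
        {**r, "Height": FILL_VALUE if r["Height"] is None else r["Height"]}
        for r in rows
    ]

# Three rows taken from the sample data; id 15 carries the implausible 260.
rows = [
    {"id": 14, "Gender": "Male", "Height": 155.0},
    {"id": 15, "Gender": "Male", "Height": 260.0},
    {"id": 16, "Gender": "Female", "Height": 153.0},
]
outliers, normal = filter_rows(rows)  # normal rows pass through unchanged
cleaned = replace_null(merge_rows(set_null(outliers), normal))
```

The normal branch corresponds to the "Dummy (do nothing)" step: its rows reach the merge untouched, and only the row with id 15 ends up with a replaced Height.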

(2) Configure the "Table input" step

Double-click the "Table input" step to open its configuration dialog, then click the [New] button to configure the database connection.

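In the SQL box, write the SQL statement that queries the data table interpolation_data, then click the [Preview] button to check whether the data has been extracted from the MySQL database into the table input stream.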
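The query itself is not reproduced in the text; a plain SELECT over the three fields would be enough. As a rough stand-in for the preview, the sketch below uses Python's built-in sqlite3 module with an in-memory database instead of MySQL, so the connection, the exact wording of the query, and the choice of three sample rows are assumptions for illustration.

```python
import sqlite3

# In-memory SQLite stands in for the MySQL database used in the tutorial.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE interpolation_data (id INTEGER, Gender TEXT, Height REAL)"
)
conn.executemany(
    "INSERT INTO interpolation_data (id, Gender, Height) VALUES (?, ?, ?)",
    [(1, "Male", 174), (2, "Male", 189), (3, "Female", 185)],
)

# The kind of query one might type into the SQL box (assumed wording).
query = "SELECT id, Gender, Height FROM interpolation_data ORDER BY id"
preview = conn.execute(query).fetchall()
for row in preview:
    print(row)
conn.close()
```

If the preview returns rows with all three fields filled in, the extraction into the input stream is working.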

 

Note: the data table interpolation_data must be created in advance. The following code is for reference only:

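-- Table holding the height survey data
create table `interpolation_data` (
	`id` int (11),           -- respondent number
	`Gender` varchar (30),   -- 'Male' or 'Female'
	`Height` double          -- surveyed height
);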
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('1','Male','174');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('2','Male','189');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('3','Female','185');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('4','Female','195');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('5','Male','149');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('6','Male','189');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('7','Male','147');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('8','Male','154');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('9','Male','174');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('10','Female','169');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('11','Male','195');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('12','Female','159');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('13','Female','192');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('14','Male','155');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('15','Male','260');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('16','Female','153');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('17','Female','157');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('18','Male','140');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('19','Male','144');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('20','Male','172');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('21','Male','157');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('22','Female','153');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('23','Female','169');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('24','Male','185');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('25','Female','172');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('26','Female','151');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('27','Male','190');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('28','Male','187');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('29','Female','163');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('30','Male','179');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('31','Male','153');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('32','Male','178');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('33','Female','195');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('34','Female','160');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('35','Female','157');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('36','Female','189');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('37','Female','197');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('38','Male','144');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('39','Female','171');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('40','Female','185');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('41','Female','175');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('42','Female','149');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('43','Male','157');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('44','Male','161');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('45','Female','182');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('46','Male','185');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('47','Female','188');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('48','Male','181');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('49','Male','161');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('50','Male','140');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('51','Female','168');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('52','Female','176');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('53','Male','163');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('54','Male','172');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('55','Male','196');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('56','Female','187');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('57','Male','172');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('58','Male','178');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('59','Female','164');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('60','Male','143');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('61','Female','191');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('62','Female','141');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('63','Male','193');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('64','Male','190');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('65','Male','175');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('66','Female','179');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('67','Female','172');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('68','Female','168');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('69','Female','164');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('70','Female','194');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('71','Female','153');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('72','Male','178');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('73','Male','141');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('74','Male','180');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('75','Female','185');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('76','Female','197');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('77','Male','165');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('78','Female','168');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('79','Female','176');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('80','Male','181');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('81','Male','164');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('82','Female','166');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('83','Female','190');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('84','Male','186');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('85','Male','168');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('86','Male','198');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('87','Female','175');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('88','Male','145');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('89','Female','159');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('90','Female','185');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('91','Female','178');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('92','Female','183');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('93','Female','194');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('94','Male','177');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('95','Male','197');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('96','Female','170');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('97','Male','142');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('98','Male','160');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('99','Male','195');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('100','Female','190');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('101','Male','199');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('102','Male','154');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('103','Male','161');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('104','Female','198');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('105','Female','192');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('106','Male','195');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('107','Male','166');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('108','Male','159');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('109','Female','181');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('110','Male','149');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('111','Female','150');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('112','Female','146');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('113','Male','190');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('114','Female','192');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('115','Female','177');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('116','Male','148');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('117','Female','165');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('118','Female','146');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('119','Male','144');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('120','Female','176');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('121','Female','168');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('122','Male','187');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('123','Male','187');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('124','Female','184');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('125','Female','158');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('126','Male','158');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('127','Male','194');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('128','Female','145');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('129','Male','182');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('130','Male','154');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('131','Female','168');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('132','Female','187');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('133','Female','158');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('134','Female','167');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('135','Female','171');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('136','Female','183');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('137','Female','190');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('138','Male','194');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('139','Male','171');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('140','Male','159');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('141','Female','169');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('142','Female','167');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('143','Male','180');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('144','Male','163');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('145','Male','140');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('146','Male','197');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('147','Male','194');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('148','Female','140');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('149','Male','195');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('150','Female','168');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('151','Female','196');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('152','Male','140');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('153','Female','150');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('154','Female','168');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('155','Female','155');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('156','Female','179');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('157','Female','182');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('158','Male','168');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('159','Female','187');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('160','Male','181');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('161','Male','199');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('162','Female','184');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('163','Male','192');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('164','Female','182');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('165','Female','172');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('166','Male','181');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('167','Male','176');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('168','Female','156');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('169','Female','151');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('170','Female','188');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('171','Male','187');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('172','Male','174');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('173','Male','167');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('174','Female','196');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('175','Male','197');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('176','Female','185');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('177','Female','170');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('178','Female','181');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('179','Female','166');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('180','Male','188');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('181','Female','162');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('182','Male','177');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('183','Male','162');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('184','Male','180');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('185','Female','192');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('186','Male','165');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('187','Female','167');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('188','Female','182');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('189','Female','161');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('190','Male','158');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('191','Male','141');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('192','Male','154');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('193','Male','165');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('194','Female','142');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('195','Male','141');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('196','Male','145');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('197','Male','157');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('198','Female','177');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('199','Female','166');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('200','Male','193');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('201','Male','184');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('202','Male','179');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('203','Female','156');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('204','Male','182');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('205','Male','145');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('206','Female','150');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('207','Male','145');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('208','Female','196');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('209','Male','191');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('210','Female','148');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('211','Female','150');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('212','Male','148');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('213','Female','153');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('214','Female','196');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('215','Female','185');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('216','Female','171');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('217','Female','143');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('218','Female','142');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('219','Female','141');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('220','Male','159');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('221','Female','173');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('222','Male','183');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('223','Female','152');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('224','Male','178');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('225','Male','188');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('226','Female','155');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('227','Male','166');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('228','Male','188');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('229','Female','171');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('230','Male','179');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('231','Female','186');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('232','Female','153');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('233','Female','184');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('234','Female','177');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('235','Male','145');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('236','Male','170');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('237','Male','181');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('238','Male','165');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('239','Female','174');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('240','Female','146');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('241','Male','178');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('242','Male','166');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('243','Male','191');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('244','Female','177');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('245','Female','183');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('246','Male','151');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('247','Male','182');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('248','Female','142');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('249','Female','188');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('250','Male','161');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('251','Male','153');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('252','Male','140');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('253','Male','169');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('254','Female','162');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('255','Male','183');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('256','Female','162');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('257','Female','172');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('258','Female','150');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('259','Female','169');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('260','Female','184');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('261','Male','159');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('262','Male','163');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('263','Male','156');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('264','Female','157');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('265','Male','147');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('266','Male','141');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('267','Male','173');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('268','Male','154');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('269','Male','168');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('270','Male','168');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('271','Male','145');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('272','Male','152');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('273','Female','187');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('274','Female','163');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('275','Male','178');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('276','Female','187');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('277','Female','179');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('278','Male','190');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('279','Male','172');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('280','Male','188');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('281','Male','193');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('282','Female','147');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('283','Female','147');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('284','Male','166');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('285','Female','192');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('286','Male','181');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('287','Male','150');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('288','Male','178');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('289','Female','156');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('290','Male','149');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('291','Male','156');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('292','Male','183');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('293','Female','162');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('294','Female','165');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('295','Female','168');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('296','Male','160');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('297','Female','169');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('298','Female','140');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('299','Female','187');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('300','Male','151');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('301','Female','186');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('302','Male','182');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('303','Male','188');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('304','Male','179');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('305','Female','156');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('306','Male','188');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('307','Male','183');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('308','Male','144');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('309','Male','196');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('310','Male','171');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('311','Male','171');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('312','Female','180');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('313','Male','191');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('314','Female','179');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('315','Female','180');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('316','Female','154');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('317','Male','188');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('318','Male','142');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('319','Male','170');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('320','Male','152');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('321','Female','190');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('322','Female','181');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('323','Male','153');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('324','Male','187');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('325','Female','144');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('326','Female','148');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('327','Female','199');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('328','Female','167');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('329','Female','164');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('330','Female','185');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('331','Female','164');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('332','Male','142');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('333','Male','165');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('334','Female','172');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('335','Female','157');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('336','Male','155');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('337','Female','167');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('338','Female','164');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('339','Female','189');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('340','Female','161');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('341','Female','155');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('342','Female','171');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('343','Female','154');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('344','Male','179');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('345','Male','170');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('346','Female','184');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('347','Female','191');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('348','Male','162');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('349','Male','178');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('350','Female','157');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('351','Male','184');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('352','Male','197');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('353','Female','160');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('354','Male','184');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('355','Male','190');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('356','Male','174');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('357','Female','189');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('358','Female','186');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('359','Female','180');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('360','Female','186');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('361','Female','193');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('362','Male','161');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('363','Female','151');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('364','Female','195');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('365','Female','184');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('366','Male','141');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('367','Female','185');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('368','Female','186');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('369','Male','142');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('370','Female','147');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('371','Male','151');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('372','Female','160');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('373','Male','185');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('374','Female','163');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('375','Male','174');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('376','Female','150');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('377','Male','142');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('378','Male','178');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('379','Female','154');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('380','Male','176');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('381','Male','159');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('382','Male','191');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('383','Male','177');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('384','Male','151');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('385','Female','182');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('386','Female','197');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('387','Male','146');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('388','Female','160');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('389','Female','157');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('390','Female','150');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('391','Female','167');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('392','Female','180');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('393','Female','183');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('394','Female','183');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('395','Female','152');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('396','Female','164');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('397','Male','187');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('398','Male','169');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('399','Female','149');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('400','Male','163');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('401','Female','195');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('402','Male','174');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('403','Male','182');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('404','Male','169');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('405','Male','193');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('406','Male','148');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('407','Male','186');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('408','Male','165');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('409','Female','146');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('410','Female','166');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('411','Male','179');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('412','Female','177');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('413','Male','181');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('414','Female','161');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('415','Female','157');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('416','Female','169');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('417','Female','152');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('418','Female','162');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('419','Male','162');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('420','Female','177');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('421','Female','195');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('422','Male','140');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('423','Female','186');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('424','Female','178');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('425','Male','174');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('426','Female','180');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('427','Male','188');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('428','Female','187');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('429','Female','153');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('430','Female','165');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('431','Female','178');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('432','Female','163');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('433','Female','150');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('434','Male','179');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('435','Male','165');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('436','Male','168');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('437','Female','153');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('438','Male','184');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('439','Male','188');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('440','Female','166');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('441','Female','172');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('442','Male','182');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('443','Male','143');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('444','Male','152');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('445','Female','186');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('446','Male','159');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('447','Male','146');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('448','Female','176');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('449','Female','146');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('450','Male','159');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('451','Male','162');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('452','Female','172');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('453','Female','169');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('454','Male','182');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('455','Female','183');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('456','Male','176');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('457','Female','188');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('458','Female','175');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('459','Male','154');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('460','Female','184');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('461','Male','179');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('462','Male','152');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('463','Male','179');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('464','Female','145');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('465','Female','181');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('466','Male','158');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('467','Female','188');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('468','Male','145');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('469','Male','161');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('470','Male','198');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('471','Male','147');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('472','Male','154');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('473','Female','178');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('474','Male','195');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('475','Female','167');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('476','Male','183');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('477','Female','164');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('478','Male','167');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('479','Female','151');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('480','Female','147');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('481','Female','155');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('482','Female','172');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('483','Female','142');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('484','Male','146');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('485','Female','188');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('486','Male','173');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('487','Female','160');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('488','Male','187');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('489','Male','198');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('490','Female','179');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('491','Female','164');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('492','Female','146');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('493','Female','198');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('494','Female','170');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('495','Male','152');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('496','Female','150');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('497','Female','184');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('498','Female','141');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('499','Male','150');
insert into `interpolation_data` (`id`, `Gender`, `Height`) values('500','Male','173');

(3) Configure the "过滤记录" (Filter rows) step

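Double-click the "过滤记录" step to open its configuration dialog. In the "条件" (Condition) area, set the filter condition so that the Height field must fall within the valid range [114, 226]. Any row whose Height lies outside this range is treated as an outlier.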

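In the "发送true数据给步骤:" (Send 'true' data to step) drop-down list, select "空操作(什么也不做)2" (Dummy (do nothing) 2). In the "发送false数据给步骤:" (Send 'false' data to step) drop-down list, select "空操作(什么也不做)" (Dummy (do nothing)). Rows inside the valid range therefore flow to "空操作(什么也不做)2", and outliers flow to "空操作(什么也不做)".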
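The filter condition amounts to a range check that splits the input into a "true" stream and a "false" stream. The Python sketch below illustrates that logic outside Kettle; it is not how Kettle implements the step. The function name `filter_rows` and the three sample rows are assumptions for illustration (the text only states that id 15 has a Height of 260; its Gender here is made up).

```python
LOW, HIGH = 114, 226  # valid Height range stated in the text


def filter_rows(rows, low=LOW, high=HIGH):
    """Split rows into (true_stream, false_stream) by a range check on Height."""
    true_stream, false_stream = [], []
    for row in rows:
        if low <= row["Height"] <= high:
            true_stream.append(row)   # normal value
        else:
            false_stream.append(row)  # outlier
    return true_stream, false_stream


# Hypothetical sample; the Gender of id 15 is an assumption.
rows = [
    {"id": 1, "Gender": "Male", "Height": 174.0},
    {"id": 15, "Gender": "Male", "Height": 260.0},
    {"id": 500, "Gender": "Male", "Height": 173.0},
]

normal, outliers = filter_rows(rows)
print("normal ids:", [r["id"] for r in normal])
print("outlier ids:", [r["id"] for r in outliers])
```

Both bounds are inclusive, matching the closed interval [114, 226].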

(4) Preview the data in the "空操作(什么也不做)" (Dummy (do nothing)) step

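Select the "空操作(什么也不做)" step and click the preview button at the top of the transformation workspace to preview its data. The row with id 15 has a Height of 260, which lies outside the valid range [114, 226], so this row is an outlier.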

(5) Configure the "设置值为NULL" (Null if...) step

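Double-click the "设置值为NULL" step to open its dialog. In the "字段" (Fields) area, add the name of the field to be set to NULL together with the value that should be turned into NULL.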
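The effect of this step can be pictured as "set the field to NULL wherever it equals the configured value". The sketch below assumes the configured field is Height and the configured value is 260, since that is the outlier named in the text; the actual dialog values are only visible in the original screenshot. The helper name `null_if` is hypothetical.

```python
def null_if(rows, field, value):
    """Return copies of rows with `field` set to None where it equals `value`."""
    return [
        {**row, field: None} if row[field] == value else dict(row)
        for row in rows
    ]


# Hypothetical outlier stream; the Gender of id 15 is an assumption.
outliers = [{"id": 15, "Gender": "Male", "Height": 260.0}]

nulled = null_if(outliers, "Height", 260.0)
print(nulled)
```

Because the comparison is an exact match, every distinct outlier value would need its own entry in the field list under this reading of the step.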

(6) Configure the "合并记录" (Merge rows (diff)) step

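Double-click the "合并记录" step to open the "合并行(比较)" (Merge rows (diff)) dialog. In the "旧数据源:" (reference rows) drop-down list, select "设置值为NULL"; in the "新数据源:" (compare rows) drop-down list, select "空操作(什么也不做)2". Under "匹配的关键字:" (keys to match), add the key field, id.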
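As a rough mental model, the merge step compares the two streams on the key field and appends a flag column (flagfield by default) with one of the values identical, changed, new, or deleted. The two streams here carry disjoint ids, so every row simply passes through with a flag, which reunites the nulled outliers with the normal rows. The sketch below is a simplified approximation under that assumption, not Kettle's implementation; `merge_rows_diff` and the sample rows are hypothetical.

```python
def merge_rows_diff(old_rows, new_rows, key="id", flag="flagfield"):
    """Merge two row lists on `key`, tagging each output row with a status flag."""
    old = {row[key]: row for row in old_rows}
    new = {row[key]: row for row in new_rows}
    merged = []
    for k in sorted(old.keys() | new.keys()):
        if k in old and k in new:
            same = old[k] == new[k]
            status = "identical" if same else "changed"
            row = old[k] if same else new[k]
        elif k in old:
            status, row = "deleted", old[k]  # only in the old (reference) stream
        else:
            status, row = "new", new[k]      # only in the new (compare) stream
        merged.append({**row, flag: status})
    return merged


# Hypothetical streams: nulled outliers (old) and normal rows (new).
old_stream = [{"id": 15, "Gender": "Male", "Height": None}]
new_stream = [
    {"id": 1, "Gender": "Male", "Height": 174.0},
    {"id": 500, "Gender": "Male", "Height": 173.0},
]

merged = merge_rows_diff(old_stream, new_stream)
for row in merged:
    print(row)
```

In Kettle itself both inputs are expected to arrive sorted on the key field; the sketch sidesteps this by sorting the keys explicitly.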

(7) Configure the "替换NULL值" (If field value is null) step

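Double-click the "替换NULL值" step to open its dialog, tick the "选择字段" (Select fields) checkbox, and in the "字段" (Fields) table add the Height field with a replacement value of 170.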
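Replacing NULL with a fixed value is a one-line operation once the outliers have been nulled. The sketch below uses the replacement value 170 from the text; the helper name `replace_null` and the sample rows are assumptions for illustration.

```python
def replace_null(rows, field, default):
    """Return copies of rows with `field` filled by `default` where it is None."""
    return [
        {**row, field: default} if row[field] is None else dict(row)
        for row in rows
    ]


# Hypothetical merged stream, still carrying the flag column from the merge step.
rows = [
    {"id": 1, "Gender": "Male", "Height": 174.0, "flagfield": "new"},
    {"id": 15, "Gender": "Male", "Height": None, "flagfield": "deleted"},
]

filled = replace_null(rows, "Height", 170.0)
print([(r["id"], r["Height"]) for r in filled])
```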

(8) Configure the "字段选择" (Select values) step

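Double-click the "字段选择" step to open the "选择/改名值" (Select / Rename values) dialog. On the "移除" (Remove) tab, add the names of the fields to remove. Here the only field removed is flagfield, the flag column added by the merge step.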

(9) Run the transformation

Click the run button at the top of the transformation workspace and wait for all steps to finish.

4. Check whether the outliers in interpolation_data were modified and replaced

Click the "字段选择" step, then open the "Preview data" tab in the execution results pane to check whether the outliers in interpolation_data have been modified and replaced.
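The same check can be made programmatically rather than by eye. The sketch below assumes the final rows have been exported into a list of dictionaries; the function name `check_result` and the sample rows are hypothetical. It reports rows whose Height is still NULL or outside [114, 226], rows that still carry flagfield, and an unexpected row count.

```python
def check_result(rows, low=114, high=226, expected_count=None):
    """Return a list of human-readable problems found in the final rows."""
    problems = []
    for row in rows:
        height = row.get("Height")
        if height is None or not (low <= height <= high):
            problems.append(f"id {row.get('id')}: Height {height!r} not in [{low}, {high}]")
        if "flagfield" in row:
            problems.append(f"id {row.get('id')}: flagfield was not removed")
    if expected_count is not None and len(rows) != expected_count:
        problems.append(f"expected {expected_count} rows, found {len(rows)}")
    return problems


# Hypothetical final output: the outlier at id 15 has been replaced with 170.
final_rows = [
    {"id": 1, "Gender": "Male", "Height": 174.0},
    {"id": 15, "Gender": "Male", "Height": 170.0},
    {"id": 500, "Gender": "Male", "Height": 173.0},
]

problems = check_result(final_rows, expected_count=3)
print("OK" if not problems else "\n".join(problems))
```

For the full table, `expected_count` would be 500, the number of people in the survey.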